ED 163 085                                             TM 008 225

AUTHOR          Conrad, Linda, Ed.; And Others
TITLE           Graduate Record Examinations: Technical Manual.
INSTITUTION     Educational Testing Service, Princeton, N.J.
PUB DATE        Sep 77
NOTE            112p.; Small print marginally legible
AVAILABLE FROM  Graduate Record Examinations, Educational Testing
                Service, Princeton, New Jersey 08541 ($6.00)
EDRS PRICE      MF-$0.83 Plus Postage. HC Not Available from EDRS.
DESCRIPTORS     Aptitude Tests; *College Entrance Examinations;
                Content Analysis; Graduate Study; Higher Education;
                *Literature Reviews; Manuals; Scores; *Statistical
                Data; Test Construction; *Test Interpretation; Test
                Reliability; *Test Validity
IDENTIFIERS     *Graduate Record Examinations; *Test Manuals



ABSTRACT 

This manual supplements previous guides to the use of
the Graduate Record Examinations (GRE). It provides sufficient
detailed information about the GREs to permit measurement specialists
and institutional researchers, as well as faculty members and
administrators, to understand the development of the tests and to
evaluate their usefulness. Chapters include: (1) A Brief Historical
Review of the Graduate Record Examinations; (2) Purposes and General
Characteristics of the Aptitude Test and Advanced Tests; (3)
Development of the Aptitude Test; (4) Development of the Advanced
Tests; (5) Statistical Methods and Analyses of the Graduate Record
Examinations; and (6) Validity of the Graduate Record Examinations.
Appendixes include information unique to the Aptitude Test and to
each of the 20 Advanced Tests including item types, test
specifications, norms, and a variety of summary statistics. (ROF)



Reproductions supplied by EDRS are the best that can be made
from the original document.








ERIC 



NATIONAL INSTITUTE OF
EDUCATION

THIS DOCUMENT HAS BEEN REPRO-
DUCED EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGIN-
ATING IT. POINTS OF VIEW OR OPINIONS
STATED DO NOT NECESSARILY REPRE-
SENT OFFICIAL NATIONAL INSTITUTE OF
EDUCATION POSITION OR POLICY.



TECHNICAL 

MANUAL 



PERMISSION TO REPRODUCE THIS
MATERIAL IN MICROFICHE ONLY
HAS BEEN GRANTED BY

.^POiJiC 

TO THE EDUCATIONAL RESOURCES
INFORMATION CENTER (ERIC) AND
USERS OF THE ERIC SYSTEM.



edited by 
Linda Conrad 
Donald Trismen 
Ruth Miller 



EDUCATIONAL TESTING SERVICE, PRINCETON, NJ 



GRADUATE RECORD EXAMINATIONS 
TECHNICAL MANUAL 






PRIMARY AUTHORS 

Robert Altman 
Linda Conrad 
Ronald Flaugher 
Raymond Thompson 
Madeline Wallmark 
Warren Willingham 



EDUCATIONAL TESTING SERVICE • PRINCETON, NEW JERSEY



Reviews or other assistance at various stages in preparation of the
text were provided by: Richard Bonis, Eleanor Colclough, Beth
Drake, Robin Durso, Barbara Esser, Susan Jackson, Miles McPeek,
Lois Nieves, Donald Powers, Gary Saretzky, Janis Somerville,
Elizabeth Stewart, Stanford von Mayrhauser, Cheryl Wild, Irene
Williams, and John Winterbottom. Final reviews were made by
Thomas Donlon and William Schrader of Educational Testing
Service, by Gerald Lannholm, formerly of ETS and one of the first
directors of the Graduate Record Examinations Program, by Gene
Glass of the University of Colorado, and by Nancy Cole of the
University of Pittsburgh.



Copies of this publication are available for $6 per copy from
Graduate Record Examinations Program, Educational Testing
Service, Princeton, NJ 08541. Please enclose payment
with your order.



The Graduate Record Examinations Board and Educational Testing Service are dedicated to the principle of equal opportunity,
and their programs, services, and employment policies are guided by that principle.

Copyright © 1977 by Educational Testing Service. All rights reserved.

Library of Congress Catalog Card Number: 73-66796






FOREWORD 



The Graduate Record Examinations Technical Manual is intended to supplement the Guide to
the Use of the Graduate Record Examinations, which itself has served as a technical manual for
GRE users for a number of years. The Guide contains information essential to test users
concerning the GRE score scale, limits to the accuracy of scores, and test validity; and it sets forth
guidelines to appropriate use of scores, discusses special score interpretation problems, and
presents tables of interpretive data based on the total GRE population as well as on subgroups
defined by educational level and by major field. Also included in the Guide is background
material describing the GRE Board, the policy-making body of the testing program, and the various
services and publications of the GRE Program.

The purpose of the Technical Manual is to provide sufficient detailed information about the
GRE to permit measurement specialists and institutional researchers, as well as faculty members
and administrators, to understand fully the development of the tests and to evaluate their
usefulness.

In 1977-78, a year when departments and institutions are evaluating the new analytical ability
measure in the Aptitude Test for its usefulness as an indicator of yet another aspect of developed
ability, publication of the Technical Manual is considered particularly important. It describes
some of the extensive research that led to the introduction of the first major change in the
Aptitude Test since the 1940s, when the GRE verbal and quantitative ability measures were
introduced.

The Manual has been written, insofar as possible, in nontechnical language so that it may be
read by the average test user as well as by specialists in measurement. Every effort has been
made to include all details necessary for careful evaluation of the Graduate Record
Examinations. However, GRE research still in progress and the findings of other testing programs have
been generally considered outside the purview of the Manual. Likewise, an exhaustive history of
the GRE Program has not been attempted; rather a brief historical review sets the stage for a
discussion that focuses on the current tests from a historical perspective. The descriptions
provided in this edition of the Manual, while intended to be definitive, cannot be considered final. As
the Manual itself reflects, the Graduate Record Examinations have shown a continuous history
of change, growth, and adaptation. As data become available on the new analytical ability
measure and as the results of recent research can be reported, supplements to the Manual will
be prepared and distributed.

Richard Armitage, Chairman, GRE Board

Lyle Jones, Chairman, GRE Board Research Committee

September 1977 






CONTENTS

Foreword

Chapters

1. A Brief Historical Review of the Graduate Record Examinations

2. Purposes and General Characteristics of the Aptitude Test and Advanced Tests
     Multiple-Choice Format
     Test Instructions
     Formula Scoring
     Test Development Staff
     Test Development Procedures
     Test Assembly
     Quality Control
     Testing Standards
     References

3. Development of the Aptitude Test
     Evolution of the Aptitude Test
     General Format
     Content Characteristics
     Verbal Ability Measure
     Content Specifications for the Verbal Ability Measure
     Quantitative Ability Measure
     Content Specifications for the Quantitative Ability Measure
     Analytical Ability Measure
     Content Specifications for the Analytical Ability Measure
     Statistical Characteristics
     Statistical Specifications
     Relationship of Statistical Analysis and Research to Test Specifications
     Standard Activities
     Special Activities
     Research Related to Restructuring the Aptitude Test
     References

4. Development of the Advanced Tests
     Uses
     Format
     Committees of Examiners
     Content Specifications
     Statistical Specifications and Characteristics
     Information Unique to Each Advanced Test
     Reference

5. Statistical Methods and Analyses of the Graduate Record Examinations
     Development of the GRE Scaled-Score System
     Scaling of the Analytical Ability Measure
     Score Equating and Related Concerns
     Subscore Scaling
     Stability of the Scale
     Stability of the Scaled-Score System
     Rescaling Study of 1967-68
     Reliability and Error of Measurement
     Item and Test Analysis
     Item Analysis
     Test Analysis
     Descriptive Statistics
     Basic Normative Data
     Descriptive Statistics for the Aptitude Test
     Other Factors Interacting with Aptitude Test Performance
     References

6. Validity of the Graduate Record Examinations
     Content Validity
     Construct Validity
     Criterion-Related Validity
     Predictive Validity
     Other Evidence of Criterion-Related Validity
     Population Validity
     Validity Studies Summarized in Discussion of Predictive Validity
     References

Appendixes

I. Four Types of Questions Studied but Not Selected for Use in the Analytical
   Ability Measure of the GRE Aptitude Test
     Letter Sets
     Logical Reasoning
     Evaluation of Evidence
     Deductive Reasoning

II. Information Unique to Each Advanced Test
     Biology
     Chemistry
     Computer Science
     Economics
     Education
     Engineering
     French
     Geography
     Geology
     German
     History
     Literature in English
     Mathematics
     Music
     Philosophy
     Physics
     Political Science
     Psychology
     Sociology
     Spanish
     References

Index



FIGURES AND TABLES

Figures

1: Level of Performance of NSF First-Year Engineering Applicants
2: Level of Performance of NSF First-Year Mathematics Applicants
3: Item Analysis Sample
4: Criterion Score Conversion
5: Relationship between P+ and Δ when M„, = 13.0
6: Usefulness of GRE Advanced Test Scores for Predicting Ph.D. Attainment in Three
   Fields (Creager, 1965)
7: Equated Difficulty of Four Types of Reading Comprehension Items in the GRE Aptitude
   Test for Black Females and a Representative Reference Sample

Tables

1: Specifications for Discrete Verbal Questions
2: Specifications for Reading Comprehension Passages
3: Specifications for Reading Comprehension Questions
4: Quantitative Specifications for the GRE Aptitude Test
5: Statistical Characteristics of Five Recent Prior Forms of the GRE Aptitude Test
5A: Statistical Characteristics of the First Two Restructured Forms of the GRE Aptitude
    Test
6: Statistical Specifications for the GRE Aptitude Test
7: A Comparison of Various Experimental Question Types
8: Policies of Graduate School Departments Listed in the Graduate Programs and
   Admissions Manual on Use of the GRE Advanced Tests for the 20 Fields of Study
   Whose Names Match or Closely Match Those of the Tests
9: Number of Questions in GRE Advanced Tests and Average Testing Time per Question
10: Statistical Characteristics of the Advanced Test Total Scores
11: Statistical Characteristics of the Advanced Test Subscores
12: Correlations among Scores on the Advanced Tests for which Subscores Are
    Reported
13: Correlations of Advanced Test Scores with Aptitude Test Scores, 1967-68
14: Advanced Tests Available
15: Scaled-Score Means and Standard Deviations of the 1952 Standardization Group
16: Scaled-Score Means and Standard Deviations of the 1967-68 Rescaling Samples
17: Total Score Distributions
18: Subscore Distributions
19A: Scoring Formulas and Reliability Coefficients
19B: Intercorrelations
19C: Speededness of Test
20: Score Distributions
21: Frequency Distributions of Original Deltas and Biserial Correlations, by Score
22: Item Distribution Sheet
23: Summary Statistics for Total Score
24: Summary Statistics for Subscores
25: Frequency Distributions for All 1975-76 Examinees Who Intended to Major in
    Microbiology
26: 1970-71 Examinee Volume for the Advanced Tests, by Educational Level
27: Aptitude Test Performance of Seniors and Nonenrolled College Graduates Classified
    by Undergraduate Major Field
28: Aptitude Test Performance of Seniors and Nonenrolled College Graduates Classified
    by Intended Graduate Major Field
29: Summary Statistics for the Aptitude Test Performance of Seniors and Nonenrolled
    College Graduates Tested in October 1977, Classified by Undergraduate Major
    Field
30: Aptitude Test Performance of Seniors and Nonenrolled College Graduates,
    Classified by Graduate Degree Objective
31: Aptitude Test Performance of Seniors and Nonenrolled College Graduates,
    Classified by Citizenship and by Primary Language
32: Median Validity Coefficients for Various Predictors and Criteria of Success in
    Graduate School
33: Median Validity Coefficients for Five Predictors of Graduate Success in Nine Fields
34: Correlations of Verbal and Quantitative Ability Scores with Self-Reported
    Undergraduate Grades, and Related Scaled Score Means and Standard Deviations
35: Correlations of Verbal and Quantitative Ability Scores with Self-Reported
    Undergraduate Grades, and Related Scaled Score Means and Standard Deviations,
    December 1975






Chapter 1 

A BRIEF HISTORICAL REVIEW OF THE GRADUATE RECORD EXAMINATIONS 



The Graduate Record Examinations, known as the Cooperative
Graduate Testing Program until 1940, were an outgrowth of a
project funded by the Carnegie Foundation for the Advancement of
Teaching in the early 1930s to study the outcomes of college
education. This project, the "Pennsylvania Study," was the first
large-scale attempt to measure academic achievement in higher
education by the use of objective multiple-choice tests.

Anticipating a large increase in numbers of applicants for
graduate study as the Depression came to an end, the Carnegie
Foundation and Columbia, Harvard, Princeton, and Yale
Universities continued the work with financial support from the
Carnegie Corporation of New York. Faculty committees drawn from
the four universities developed tests intended to measure students'
intellectual growth and development both through study of the
liberal arts and through mastery of specialized fields.

The original test battery consisted of eight "profile" tests in
mathematics, physics, chemistry, biology, social studies, literature,
fine arts, and a "verbal factor." These tests were administered for
the first time in October 1937 to first-year graduate students at
Columbia, Harvard, Princeton, and Yale. Since the profile tests
were not completely appropriate for measuring one's state of
learning in a discipline or major field of study, work was begun to
develop 16 "advanced" tests. These were first administered in the
fall of 1939.

Interest in the tests spread rapidly. In the early years of the
program, validity studies were carried out at the four initial
participating universities and also at Indiana and Vanderbilt Universities,
the State University of Iowa, and the Universities of Michigan,
Pittsburgh, and Wisconsin. By 1940, the tests had been administered to
more than 27,000 students in 14 graduate and 26 undergraduate
institutions, and results were promising enough to cause widespread
consideration to be given to the use of GRE scores as part of the
credentials to be presented for admittance to graduate school. In
1942, the Carnegie Corporation said in its annual report that "the
examination scores alone are approximately as useful as transcript
records taken alone, and the two combined in a manner which uses
the test results as a supplement to other evidence of students'
qualifications yield a better basis for classifying students than
either one used alone" (quoted by Howard J. Savage in Fruit of an
Impulse, p. 291).

The Testing Modes

Prior to 1942, the Graduate Record Examinations were given solely
through "cooperating" institutions, that is, in the so-called
"institutional mode." In 1942, however, the increasing use of the
examinations as part of the process of admission to graduate study led to
the establishment of the first test centers at which students not
enrolled in the testing institution could take the tests. The gradual
shift toward use in admissions was reflected in the number of
undergraduate and graduate students tested: in 1938-39, 1,131 (28
percent) of the 3,969 students tested were undergraduates, while by
1941-42, undergraduates accounted for 5,312 (67 percent) of the
7,936 students taking the GRE. The Independent Student Testing
Program was therefore initiated in 1942-43, and in the first year 135
students were tested via the "individual mode" at 35 testing
locations.

After the Second World War, as the number of students returning
to academic study increased, so did the number of students taking
the Graduate Record Examinations. In 1944-45, 6,446 students
took the GRE; by 1948-49, the annual number had grown to 51,231.

During the same period, the emphasis of the Institutional Testing
Program shifted. Initially, it was the mechanism through which
graduate institutions tested their own enrolled first-year graduate
students, but over the years it was increasingly used by institutions
to assess the educational accomplishments of their undergraduate
students. To accommodate this particular need of undergraduate
schools, the Tests of General Education were introduced in 1946.
According to the February 1947 Bulletin of the Graduate Record
Examinations, the new instruments were designed "to measure as
directly as possible the attainment of important objectives of
general education at the college level" (p. 8). The Profile Tests
continued to be offered through both the Independent Student
Testing Program and the Institutional Testing Program, but their
use was to be "restricted to graduate and professional students and
to applicants to such schools," according to the Bulletin (p. 7); and
"undergraduate colleges administering the Graduate Record
Examinations for purposes of general guidance and appraisal are
required to administer the Tests of General Education rather than
the Profile Tests" (p. 8).

Content of the Testing Program 

In 1949, the GRE Aptitude Test was introduced as a regular part of
the Graduate Record Examinations Program, leading to
modifications in both the Profile Tests and Tests of General Education as
their emphasis on general verbal and quantitative abilities was
reduced. The Aptitude Test, first administered as the Graduate
Aptitude Test in a 1946 experiment, generated two scores: a verbal
ability score and a quantitative ability score. With its introduction,
the last basic piece of the Graduate Record Examinations Program
as it is known today was in place.

In January 1948 the Graduate Record Examinations became the
responsibility of the newly established Educational Testing Service.
Almost immediately, liaison was established with a newly created
Committee on Testing of the Association of Graduate Schools
(AGS) in the Association of American Universities, which worked
with the GRE Program office to review the tests and services
offered. In 1951, the name of the Independent Student Testing
Program was changed to the National Program for Graduate
School Selection, and changes in the test offerings continued as
the needs of both the National Program and Institutional Program
were continually reevaluated.

With the growth in the utility and use of the new Aptitude Test, the
Profile Tests were discontinued in 1953 in the National Program
and in 1954 in the Institutional Testing Program. In that same year,
the Institutional Testing Program also discontinued the Tests of
General Education, replacing them with the Area Tests, a
comprehensive appraisal of college students' orientation in three
principal areas of human culture: social science, humanities, and
natural science. Over the course of several years, Advanced Tests in
education, engineering, and Spanish were introduced, and
Advanced Tests in agriculture, fine arts, German, and home
economics were discontinued.

By 1964, the GRE Program included the Aptitude Test and 18
Advanced Tests in biology, business, chemistry, economics,
education, engineering, French, geology, government, history, literature,
mathematics, philosophy, physical education, physics, psychology,
sociology, and Spanish. In 1965, new Advanced Tests in music and
speech were introduced; in subsequent years Advanced Tests in
geography (1966), anthropology (1968), and German (1970) were
introduced, and in 1970 the tests in business and physical
education were discontinued. After reconsideration, the new tests in
speech and anthropology were also discontinued in 1970 and 1971,
respectively.

By 1972, the basic GRE test offerings had evolved into 19
Advanced Tests and the Aptitude Test with verbal and quantitative
sections. Since that date, approximately 300,000 people have taken
the Aptitude Test each year, with mean scores based on all
test-takers ranging between 497 and 492 over the five-year period since
1972. A complete review of the existing Advanced Tests was carried
out between 1970 and 1972, resulting in the availability of
subscores as well as total scores for nine of the tests. Then, beginning
in 1974, a series of studies relating to the possible restructuring of
the Aptitude Test was undertaken; the result, in October 1977, was
a new Aptitude Test including the basic verbal and quantitative
ability measures and a measure of analytical ability as well. Thus as
of 1977, the program's basic test offerings include the Aptitude Test
with verbal, quantitative, and analytical sections and 20 Advanced
Tests, the twentieth, computer science, having been added in 1976.

The GRE Board 

By the mid-1960s, the growing importance of the Graduate Record
Examinations created a need for greater participation by the
graduate school community in setting policies. In 1965, ETS, the
AGS Committee on Testing, and the Committee on Testing of the
Council of Graduate Schools (CGS) in the United States jointly
prepared a Proposal for the Establishment of a Graduate Record
Examinations Board. The proposal began with the following
statement: "The use of the GRE has increased significantly and,
although ETS has always sought the advice of the graduate schools
and their faculties in the development and administration of the
program, most notably through the AGS Committee on Testing,
the increasing use of the GRE, and the likelihood that the
increasing trend towards graduate study will accelerate that use, makes it
appropriate that a closer relationship between the graduate
schools and the GRE be considered" (p. 1).

As a result of that proposal and actions taken by the Executive
Committees of both AGS and CGS, the Graduate Record
Examinations Board was created effective January 1, 1966. Consisting of
four members appointed by AGS, four appointed by CGS, and eight
appointed at large by the board itself, the new board soon signed a
compact with ETS that outlined the board's responsibilities and
ETS's agreement "to vest in the board authority over the general
policies of the GRE" (p. 1).

Since 1966 all major decisions concerning the offerings and
administration of the GRE Program have been made by the GRE
Board, drawing on professional personnel at ETS as board staff.
One of the board's first major decisions was to limit its concerns
and policy control to the National Program for Graduate School
Selection, and this decision was implemented in October 1969. (The
Institutional Program was continued by Educational Testing
Service as the Undergraduate Program.) Effective in October 1969
also, the GRE Board instituted the Local Administration Service to
enable graduate schools to administer the GRE to their own
enrolled graduate students for purposes of evaluating students or
programs, selecting students for more advanced programs, and
other nonadmission purposes. (The Local Administration Service
will be discontinued after June 1979 because of declining
graduate-school interest.) The Special Administration Service, which enables
students to take the GRE on dates other than those of National
Administrations, had begun in New York City in the 1940s.
Additional Special Administration test centers in other large cities were
opened in subsequent years, to a total of five by 1967 and eight by
1973.

As provided for by its bylaws, the board soon developed a
committee structure and, reflecting a priority that has continued up to
the present time, created as its first standing committee a
Committee on Research and Test Development. Board-sponsored research
projects addressed a wide range of issues and questions relating to
the transition from undergraduate to graduate study and graduate
study itself, as well as matters more directly related to the GRE
Program of tests and services. Early studies included investigation
of alternate methods of equating GRE tests (1969) and of the use of
Bayesian statistics to facilitate validity studies (1970), as well as
surveys of existing graduate admissions and fellowship selection
policies (1970) and of programs available for disadvantaged
students (1973). More recently, board projects have included
studies of male and female doctoral students, the identification of
dimensions of quality in doctoral programs, and several projects
relating to possible modification of the GRE Aptitude Test.

Other Activities of the GRE Board 

In 1973 the GRE Board entered into an agreement with the College
Entrance Examination Board and ETS to create a new policy group
for the Test of English as a Foreign Language (TOEFL) to which the
GRE Board appoints three representatives. In similar fashion, the
GRE Board participates in the governance of the Graduate and
Professional School Financial Aid Service (GAPSFAS) by
appointing three representatives to its Council. In 1976, as an outgrowth of
a special study committee jointly created by ETS and the GRE
Board, the Undergraduate Assessment Program (UAP) Council was
created to assume policy direction for that program, the revised
and redesigned descendant of the former GRE Institutional Testing
Program that had been administered by ETS as the Undergraduate
Program since 1969.

Between 1937 and 1967, activities relating to the Graduate
Record Examinations focused almost exclusively on the tests and
the arrangements for their administration; since 1967, the GRE
Board has broadened the services and activities of the GRE
Program to include research, guidance publications (Graduate
Programs and Admissions Manual [1972 ff.] and Thinking about
Graduate School [1975]), a Minority Graduate Student Locater
Service (1973), and numerous special projects, surveys, and
conferences concerning issues in graduate education.

As the Graduate Record Examinations begin their fifth decade,
and their second under the direction of the GRE Board, they
continue to play an important role in the admissions process to
American graduate education; equally important, they now provide
the base from which a broad program of related research,
publications, and services has evolved.

References 

A Compact between the Graduate Record Examinations Board and
Educational Testing Service, April 1966.

A Proposal for the Establishment of a Graduate Record
Examinations Board. Prepared by Educational Testing Service in
consultation with the Committee on Testing, Association of Graduate
Schools, and the Committee on Testing, Council of Graduate
Schools in the United States, March 1965.

Savage, H. J. Fruit of an impulse: Forty-five years of the Carnegie
Foundation 1905-1950. New York: Harcourt, Brace and Co.,
1953.

Vaughn, K. W. The Graduate Record Examination: A statement of
policy to cooperating colleges and universities (The Graduate
Record Office Bulletin Number 1). New York: The Graduate
Record Office, February 1947.



Chapter 2 



PURPOSES AND GENERAL CHARACTERISTICS OF
THE APTITUDE TEST AND ADVANCED TESTS



The GRE Aptitude Test and Advanced Tests are provided to aid
prospective graduate students and institutions in the application
and admissions process. Students take the tests, usually in
response to an institutional or departmental requirement, to
provide information in addition to undergraduate grades and other
indicators of past and potential performance.

The examinations are administered several times a year, both
nationally and in foreign countries, under standardized conditions.
Scores are usually reported from four to six weeks after each test
administration. Detailed information related to the administrations
is found in the GRE Information Bulletin, a new edition of which is
published annually.

To prevent the contents of a given test from becoming common
knowledge and to assure that scores are authentic, three primary
methods are used: 1) strict regulation of the handling of test
materials so that students will not have an opportunity to see the test
except during its administration; 2) provision of multiple parallel
forms of the test (the number of forms determined by the number of
students taking the particular test) to reduce the chance that a
student will take the same test twice and make more difficult
possible attempts to divulge test content; and 3) administrative
procedures to curb impersonation and copying. In addition, the
multiple scores of repeaters (students who take the test more than
once) are checked for statistically unlikely differences. Such
unusual cases are investigated by the ETS Security Office to
determine whether an irregularity has taken place.¹ Reports from
students or supervisors of suspicious behavior are also thoroughly
checked.¹ Scores that prove to be inauthentic are either not
reported or canceled.
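
The check of repeaters' scores for statistically unlikely differences can be illustrated with a minimal sketch. The manual does not describe the procedure actually used, so everything here is an assumption made for illustration: the function name `repeater_check`, the use of the standard error of a difference between two scores with equal and independent measurement error, the standard error of measurement of 30 scaled-score points, and the cutoff of 3.

```python
import math


def repeater_check(first_score, second_score, sem, z_cutoff=3.0):
    """Compare two scores earned by the same examinee.

    Returns (z, flagged). z is the score change expressed in units of
    the standard error of a difference; flagged is True when the
    absolute value of z exceeds z_cutoff (a hypothetical threshold).
    """
    # Standard error of the difference between two scores that each
    # carry the same standard error of measurement, errors independent.
    se_difference = math.sqrt(2.0) * sem
    z = (second_score - first_score) / se_difference
    return z, abs(z) > z_cutoff


# Hypothetical values: SEM of 30 scaled-score points, cutoff of 3.
for first, second in [(450, 700), (500, 540)]:
    z, flagged = repeater_check(first, second, sem=30)
    print(first, second, round(z, 2), flagged)
```

A flag produced this way would only mark a case for review; as the text notes, the determination of whether an irregularity occurred rests with an investigation, not with the statistic.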

The test scores are intended for use along with other indicators
of student performance by graduate departments, graduate
admission committees, and fellowship sponsors in making admission
decisions or awarding fellowships. The Guide to the Use of the
Graduate Record Examinations, which this manual supplements,
sets forth program policies concerning who may receive the scores.
The Guide also provides all necessary information for properly
interpreting and using the scores reported for the Aptitude Test and
Advanced Tests.

Multiple-ChoJce Format 

All tests now offered by the GRE Program are in a multiple-choice
format. Although they have certain limitations, modern objective
tests can present challenging intellectual tasks to examinees as
well as measure factual knowledge. A prime advantage of
multiple-choice tests over free-response tests is that they permit wider and,
hence, more accurate sampling of material in a given period of
time. Thus, more measurements of different facets of examinees'
thinking are secured per unit of time. A second important
advantage is the elimination of one source of measurement error. In a
free-response test, measurement errors are associated with the
test, the examinee, and the grader. In multiple-choice tests, one of
these sources of error, the grader, is eliminated. These two
advantages result in higher reliabilities for multiple-choice measures.
A third advantage is their practicality; the tests can be scored by
machine so that scores can be reported more quickly and more
economically than would otherwise be possible.

¹Procedures for protecting the student under investigation and for assuring the
authenticity of scores are described in ETS Procedures for Determining the Validity of
Questionable Scores.



Test Instructions 

The general instructions for taking the tests are intended to suggest
a widely applicable test-taking strategy, to alert the student to the
multiple-choice format, and to describe the method of scoring,
which corrects for random guessing. These general instructions
are provided for the students to read at the beginning of the testing
sessions. (The complete text of the instructions is in the GRE
Sample Aptitude Test made available by the GRE Program.)

The instructions specific to and immediately preceding groups of
each type of question are provided within the timed sections and
are made as concise and clear as possible. Questions that have
answer choices unique to them tend to require very brief
instructions. However, fixed-format questions (those for which a
fixed set of answer choices applies to all or to a group of
questions) tend to require longer instructions. The fixed-format
questions, however, are relatively less time-consuming than most of
the unique answer-choice questions. In cases where the method of
solving a problem or the criteria a student must use to evaluate
material must be established, examples are included as part of the
instructions.



Formula Scoring 

All questions that contribute to a given score have equal weight. To
eliminate the potential advantage of random or haphazard guessing,
formula scoring is used. The formula used for computing scores on
the tests is $R - \frac{W}{K-1}$, where

R is the number of right answers,

W is the number of wrong answers, and

K is the number of answer choices per question.

For most of the tests (all except part of the quantitative section of
the Aptitude Test, the Advanced German Test, and part of the
Advanced Spanish Test) this formula becomes $R - \frac{W}{4}$, since
each question has five answer choices.
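
As an illustration only, the scoring formula can be sketched in a few
lines of Python. The function name and the sample counts below are
hypothetical and are not taken from any ETS scoring program; exact
fractions are used so that quarter points are not rounded. Because the
formula involves only R and W, omitted questions neither add to nor
subtract from the score.

```python
from fractions import Fraction

def formula_score(right, wrong, choices=5):
    """Return R - W/(K-1) for R right answers, W wrong answers, K choices."""
    if choices < 2:
        raise ValueError("a question needs at least two answer choices")
    return Fraction(right) - Fraction(wrong, choices - 1)

# Hypothetical examinee: 60 right, 20 wrong, 20 omitted, five-choice questions.
print(formula_score(60, 20))   # 55
# One-fifth right and four-fifths wrong on 100 five-choice questions.
print(formula_score(20, 80))   # 0
```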

The rationale for the use of the instructions on guessing and the
use of the scoring formula rests on the most likely result of pure
guessing. If an examinee makes random or haphazard guesses on
each five-choice question in a test, the most likely result is that
one-fifth of the questions will be answered correctly and
four-fifths of the questions incorrectly, by chance alone. In the
most likely outcome, application of the formula $R - \frac{W}{4}$
yields the intuitively reasonable score of zero.

All other outcomes, from getting all the questions right to getting
all the questions wrong, can also be the result of pure guessing. For
example, there is a probability of $\left(\frac{1}{5}\right)^{N}$
that an examinee would get all N questions in a test of five-choice
questions right through pure guessing. That is a vanishingly small
probability for a test of 100 or more questions; nevertheless, it is
possible to get all questions right through haphazard guessing, and
application of the correction formula in that case would not alter
this highest possible score. For a random guesser taking a
100-question test, with each question counting 1 raw score point,
the probability that a corrected score will be greater than 5 points
is only 1 chance in 6, greater than 10 only 1 chance in 40, and
greater than 15 points only 1 chance in 740.
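
The odds quoted for the random guesser can be checked with a short
calculation. The sketch below is illustrative: it assumes the guesser
answers every question, and it computes each tail probability two
ways, from the exact binomial distribution and from a normal
approximation without a continuity correction. The quoted figures
(1 in 6, 1 in 40, 1 in 740) appear to agree closely with the normal
approximation; that is an inference from the numbers, not something
the manual states, and the exact binomial values differ somewhat.

```python
import math

N_QUESTIONS = 100
P_RIGHT = 1 / 5  # chance of guessing a five-choice question correctly

def corrected_score(right, n=N_QUESTIONS):
    """Formula score R - W/4 when all n questions are answered."""
    return right - (n - right) / 4

def exact_tail(threshold, n=N_QUESTIONS, p=P_RIGHT):
    """Exact binomial probability that the corrected score exceeds threshold."""
    return sum(
        math.comb(n, r) * p**r * (1 - p) ** (n - r)
        for r in range(n + 1)
        if corrected_score(r, n) > threshold
    )

def normal_tail(threshold, n=N_QUESTIONS, p=P_RIGHT):
    """Normal approximation to the same tail, with no continuity correction."""
    mean_r = n * p
    sd_r = math.sqrt(n * p * (1 - p))
    # corrected score = 1.25 * R - n / 4, so convert the threshold to a cutoff on R
    r_cut = (threshold + n / 4) / 1.25
    z = (r_cut - mean_r) / sd_r
    return 0.5 * math.erfc(z / math.sqrt(2))

for points in (5, 10, 15):
    print(points, "normal: 1 in", round(1 / normal_tail(points)),
          "| exact: 1 in", round(1 / exact_tail(points)))
```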

As soon as the examinee begins to read the questions and bring
some knowledge to bear in answering them, one leaves the realm of
pure guessing. If the examinee can rule out one of the five answer
choices as wrong and guesses the answer from among the four
remaining options, the probability is 1 chance in 4 that the
examinee will gain 1 point by responding correctly, and the
probability is 3 chances in 4 that the examinee will lose 1/4 point
by responding incorrectly. Since
$\frac{1}{4} \times 1 = \frac{1}{4}$ and
$\frac{3}{4} \times \frac{1}{4} = \frac{3}{16}$, the odds
clearly favor answering a question in which one or more of the
answer choices can be ruled out as wrong. Thus, the general
instructions, to be read before beginning the test, advise the
examinee: "It is improbable . . . that mere guessing will improve
your score significantly; it may even lower your score, and it does
take time. If, however, you are not sure of the correct answer but
have some knowledge of the question and are able to eliminate one
or more of the answer choices as wrong, your chance of getting the
right answer is improved, and on the average it will be to your
advantage to answer such a question."
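
The same arithmetic can be extended to the cases in which two or three
choices are ruled out. The following sketch is illustrative only; the
function name is hypothetical, and it assumes a five-choice question
scored with a penalty of 1/4 point for a wrong answer.

```python
from fractions import Fraction

def expected_gain(choices_left, total_choices=5):
    """Expected formula-score change from guessing among the remaining choices."""
    p_right = Fraction(1, choices_left)
    penalty = Fraction(1, total_choices - 1)  # points lost for a wrong answer
    return p_right * 1 - (1 - p_right) * penalty

for left in (5, 4, 3, 2):
    print(left, "choices left:", expected_gain(left))
# 5 choices left: 0
# 4 choices left: 1/16
# 3 choices left: 1/6
# 2 choices left: 3/8
```

With no choices eliminated the expected gain is zero; with one
eliminated it is 1/4 - 3/16 = 1/16 point, and it grows as more choices
are ruled out.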



Test Development Staff 

The professional staff primarily responsible for the content of the
GRE Aptitude Test and Advanced Tests generally have advanced
degrees in fields related to the tests they develop; for example,
those responsible for the verbal measure in the Aptitude Test tend
to have backgrounds in the humanities or in measurement. Those
responsible for the mathematics or quantitative portion tend to
have advanced degrees in mathematics or a related field.
Responsibility for the analytical measure is shared by people with
humanities, science, and mathematics backgrounds, some of them
with formal training in logic.

Test specialists usually have considerable experience in test
development and have training in psychometric principles and
techniques as they relate to test construction. Experience in
teaching is quite common. Most of the test development staff maintain
close contact with experts in their respective fields and have
immediate access to test-related research carried out at ETS and
conducted outside the organization (the Brigham Library, located
at ETS, has a large collection of test-related materials). Persons
preparing the reading comprehension materials for the Aptitude
Test, for example, regularly receive numerous nontechnical
periodicals written at a level appropriate for the students being
tested. Test development staff members also have access to the
Firestone Library at Princeton University and other nearby
educational institution libraries as well as to ERIC, a
computer-accessed library of educational research.

In preparing the Advanced Tests in particular fields, test
specialists take the initiative in securing new questions, obtaining
reviews of new questions and tests, arranging for committee
meetings, planning meeting agendas and working schedules, and
shepherding a test through assembly, editing, and production. The
role of the test specialists in working with the committees of
examiners, consisting of experts in the respective fields, partly
depends on their background, experience, and personality.

The test specialists represent a wide cross section of educated
persons in the United States, including men and women who come
from various regions of the country, have different religious and
ethnic backgrounds, and have been educated in small as well as
large, and public as well as private, institutions. A diversified
staff is considered especially important because the tests must be
made appropriate for a large, heterogeneous population.
Approximately equal numbers of men and women are test specialists
for GRE Advanced Tests; minority group members are also represented.

In an attempt to achieve even greater diversity, the test
development staff hires writers outside the organization to
supplement material generated at ETS. These may include former staff
members, recommended advanced graduate students who are close to the
academic activities of the population for which the test is
developed, and faculty members with a professional interest in
testing.



Test Development Procedures 

Methods of generating material are unique to each writer, but
formal standardized procedures have been developed to guide the
generation process, to assure uniformly high quality material, to
avoid idiosyncratic questions, and to encourage the development
of test material that is widely appealing.

An important part of the generation of test material is the review
process. Each question developed, as well as any stimulus material
on which questions may be based, must be reviewed by several
independent critics. In the review process for the Aptitude Test, the
writer must take into consideration a reviewer's comments, revise
the question as necessary, submit the revised question for a second
review by another individual, again revise, and, if changes are
substantial, submit the question for yet a third test specialist's
review. In certain cases, questions may be reviewed by an expert
outside ETS who can bring a fresh perspective to review of the
questions.

Central to the review process for the Advanced Tests is the
committee of examiners for each test. After Advanced Test questions
have been written, they are reviewed by the ETS test specialist and
then prepared in multiple copies for review by the committee
members. Each committee member receives a collection of all the
questions and forms on which to record reactions. Ordinarily, the
committee members are asked to indicate the correct answer for
each question, their rating of the importance of the question's
subject-matter content, their rating of its technical quality, and
any revisions or comments they deem appropriate. Probably the single
most important part of the review is their indication of the correct
answer. Any disagreements among the committee members regarding the
correct answer clearly signal the presence of possibly serious
flaws. The ratings of question content and technical quality
are also important, and distinguishing between content and
technical quality is useful. Questions rated highly on both content
and technical quality are clearly the best, and those with high
content and low technical quality ratings may well be worth revising
to improve their technical quality. However, those with low content
ratings are likely candidates for discarding, regardless of their
technical quality.

The next step is collation of the independent reviews of the
committee members. A copy of the collated reviews is sent to each
member. The sources of individual review comments are not
identified. Thus, the committee members review the new questions
and then have a chance to review the consolidated reviews of their
fellow committee members.

The most significant activity at almost every committee meeting
is thorough review of questions for a new test edition. Most of the
time and energy of the participants is devoted to this activity.
Generally, the new questions are taken up one by one. After
discussion, each question is either approved (often with substantial
revision), discarded, or held for possible revision or use in the
future. Then decisions must be made as to which approved questions
should be used in the test to provide a balanced coverage of all
aspects of the test specifications and to avoid undue overlap.

It may be useful to think of three facets of question review for the
Advanced Tests: review of subject matter, review of technical
quality, and review of editorial style. As a rule, the committee
members and ETS test specialists are in a position to provide all
three facets of the review. It is not uncommon, however, for
committee members to focus on subject matter, ETS specialists to
concentrate on technical quality, and ETS editors, who review
questions at a later stage, to concern themselves mainly with
editorial style.

Test Assembly 

After the items have been reviewed and revised, they are culled to
produce a group of the best questions, consistent with the
specifications, for inclusion in a test. In the case of the Aptitude
Test, the selected questions are assembled first in pretest form. The
questions judged best based on their performance in the pretest
become part of a pool of questions from which a final form of the
test is assembled. In the case of the Advanced Test, assembly of the
final form typically begins in the committee meeting and is
completed by the test specialist with committee advice.

The test assembler considers not only the individual question but
also the relationship of the question to the entire group of
questions in the test being prepared. For example, in preparing the
Aptitude Test, the test assembler makes sure that no two questions
are actually asking the same thing in a set of reading comprehension
questions and avoids in vocabulary questions the frequent
reappearance of words already in the test. Test assembly requires
ordering the questions, with very easy questions placed first,
balancing the questions to meet test specifications for content and
statistical qualities, and recording information showing how closely
the test matches the specifications.

Test assembly includes attention even to such seemingly minor
details as assuring that no preponderance of correct answers is
associated with a particular letter (E, for example). The following
formula is used to check the balance of correct answer choices:
$\frac{N}{n} \pm \sqrt{\frac{N}{n-1}}$, where N is the number of
questions in the test and n is the number of options for each
question. In a test of 160 five-answer-choice questions, the number
of (E) correct answers must be:

$\frac{160}{5} \pm \sqrt{\frac{160}{4}} = 32 \pm 6$ = 26 to 38.
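
The balance check lends itself to a brief sketch. The code below
assumes that the balance rule reads N/n plus or minus the square root
of N/(n - 1), an interpretation that reproduces the 32 plus or minus 6
of the worked example for 160 five-choice questions; the function
names and the sample keys are hypothetical.

```python
import math
from collections import Counter

def key_balance_limits(n_questions, n_options):
    """Return (low, high) under the assumed rule N/n +/- sqrt(N/(n-1))."""
    center = n_questions / n_options
    half_width = math.sqrt(n_questions / (n_options - 1))
    return center - half_width, center + half_width

def unbalanced_letters(key, options="ABCDE"):
    """List the answer letters whose frequency in the key is outside the limits."""
    low, high = key_balance_limits(len(key), len(options))
    counts = Counter(key)
    return [letter for letter in options if not low <= counts[letter] <= high]

low, high = key_balance_limits(160, 5)
print(round(low), "to", round(high))      # 26 to 38
print(unbalanced_letters("ABCDE" * 32))   # [] (every letter keyed 32 times)
```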

In a folder with the assembled questions (each on an individual
card with information, if available, on its statistical
characteristics) are included several kinds of information: 1) the
title, assembler, purpose, and schedule of the examination; 2) the
content characteristics of the test; 3) the distribution (with means
and standard deviations computed) of questions according to
estimated or known difficulty and discriminating power (for the
entire test and, in the case of the Advanced Tests, for the equating
subtest separately); 4) specifications for equating and item
analysis; 5) a record of previous sources, uses, and statistical
characteristics of each question; and 6) an official key. The
official key shows the correct answer for each question. This key
can be certified as official and signed by the test assembler only
if at least three independent experts have agreed on the correct
answer for each question.
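
The certification rule for the official key can be pictured as a
simple check. The sketch below is hypothetical: it assumes each
expert's answers are recorded as a string with one letter per
question, and it reads the rule as requiring at least three experts to
match the keyed answer on every question.

```python
def key_disagreements(official_key, expert_answers, required=3):
    """Return question numbers where fewer than `required` experts match the key."""
    flagged = []
    for number, keyed in enumerate(official_key, start=1):
        agreeing = sum(
            1 for answers in expert_answers if answers[number - 1] == keyed
        )
        if agreeing < required:
            flagged.append(number)
    return flagged

# Hypothetical four-question key reviewed by three experts.
experts = ["ABCD", "ABCA", "ABDA"]
print(key_disagreements("ABCD", experts))   # [3, 4]
```

Under this reading, a key would be ready for signature only when the
returned list is empty.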

When the test has been assembled, it is reviewed by a second test
specialist. Then it is reviewed by the test development coordinator.
After mutually agreeable resolutions of any points raised in these
reviews have been reached, the test goes to a test editor. The test
editor's review is likely to result in many suggestions for change,
and the test assembler must decide how these suggested changes
will be handled. If a suggested change yields an improvement from
an editorial viewpoint, without jeopardizing the content integrity,
the change is made. Otherwise, new wording is sought that will
meet the dual concerns of content integrity and editorial style.
After a second careful editorial review by a copyreader,
camera-ready planograph copy is prepared by specialists in test
typing, drafting, layout, and proofing.

In the case of the Aptitude Test, the camera-ready copy is
returned for reviews by the test assembler and by another test
specialist. The assembler and planograph reviewer check for any
problems that may have been overlooked. All reviewers except the
editors, copyreaders, and proofreaders must attempt to answer
each question without first checking the answer key. This means
that each reviewer is "taking the test" and is uninfluenced by
knowledge of what the question writer or test assembler felt the
answer should be.

In the case of an Advanced Test, the camera-ready copy must be
reviewed again by the committee of examiners. Photocopies of the
camera-ready test are sent to each member of the committee of
examiners. At this stage the committee members are asked to take
the test and to mark the correct answers to the questions. They note
any changes they think need to be made to ensure accuracy and
eliminate ambiguity. On the basis of these reviews, the test
assembler specifies the final changes to be made. Special problems
may require consultation with the committee chairman.

After a final review for correspondence between directions and
questions, question and page numbering, and overall layout, the
planograph is sent to the printer under conditions designed to
protect the confidentiality of the test material. Review of a proof
copy precedes printing.






Quality Control 

Test quality and the consistency of quality across test editions are
controlled largely through the extensive reviewing process during
which a number of independent critics evaluate each question in a
test for content, clarity, accuracy, and style. However, two methods
requiring statistical analysis of questions are of major importance
in assuring that a final test of high quality is produced:
pretesting for the Aptitude Test and preliminary item analysis
(statistical analysis of individual questions in a final test form
before scoring) for the Advanced Tests. A full test analysis that
gives detailed information on the test's reliability, score
distributions, speededness, and other characteristics is always
provided to test development staff and committees after scores have
been reported. However, the purpose of a full test analysis is to
guide development of future forms of the test and to document the
characteristics of a given form. The purpose of pretesting and of
preliminary item analysis is to assure that a given test form, by
the time it is produced and scored, contains questions that are
without serious flaws.

Pretesting requires inclusion of some questions in the test that
do not contribute to examinees' scores but are experimentally
"scored" for a representative sample of the population to obtain
information on the difficulty and usefulness of the questions. All
questions contributing to anyone's GRE Aptitude Test scores have
been pretested before inclusion in a final form of the test. Pretest
data are valuable because they enable test specialists to eliminate
poor questions (perhaps revising and repretesting them) and to
meet rather precisely the test specifications for difficulty and
reliability. Examinees are informed through the GRE Information
Bulletin and the GRE Sample Aptitude Test that such trial questions
are part of the examination and that they will not affect reported
scores. After statistical analysis is completed on pretested
questions, information on the performance of each question is pasted
on a card with the question printed on the front. This assures that
the information on the question will be readily available and
convenient to use. The card is then used in assembling a final form
of a test (assuming that the performance of the question is
satisfactory). The analysis on the back of the card provides
information on the number of people in the sample, difficulty level
of the question (percent answering it correctly), number selecting
each answer choice, and mean ability level of those selecting each
choice (mean ability level is defined in terms of performance on the
appropriate ability measure in the actual test). Another bit of vital
information on the analysis card is the r-biserial, that is, the
question-test correlation. If students doing well on the test as a
whole also do well on the question, the correlation will be
relatively high; if not, it will be relatively low. If all the
r-biserials are very high, the test may be measuring a construct too
limited or too narrowly defined. If the r-biserial of a question is
very low, it is not contributing to the reliability of the test as a
whole. The low r-biserial suggests that there may be a problem
inherent in the question or that students are unfamiliar with the
material or concept tested.
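
The entries on an analysis card can be illustrated with a small
calculation. The sketch below is not the ETS item analysis program;
the function name, the data layout, and the sample responses are
hypothetical, and the r-biserial is computed with the standard
textbook biserial coefficient, (M_right - M_wrong) / s x pq / y, where
y is the ordinate of the normal curve at the point that divides it in
the proportions p and q.

```python
from collections import defaultdict
from statistics import NormalDist, mean, pstdev

def item_card(responses, ability, key):
    """Summarize one question: difficulty, choice counts, mean ability, r-biserial."""
    n = len(responses)
    by_choice = defaultdict(list)
    for choice, score in zip(responses, ability):
        by_choice[choice].append(score)
    right = by_choice.get(key, [])
    wrong = [s for c, scores in by_choice.items() if c != key for s in scores]
    p = len(right) / n
    card = {
        "n": n,
        "percent_correct": 100 * p,
        "count_by_choice": {c: len(s) for c, s in by_choice.items()},
        "mean_ability_by_choice": {c: mean(s) for c, s in by_choice.items()},
        "r_biserial": None,  # undefined if everyone or no one answers correctly
    }
    sd = pstdev(ability)
    if 0 < p < 1 and sd > 0:
        ordinate = NormalDist().pdf(NormalDist().inv_cdf(p))
        card["r_biserial"] = (mean(right) - mean(wrong)) / sd * p * (1 - p) / ordinate
    return card

# Hypothetical sample of eight examinees; the keyed answer is D.
responses = ["B", "C", "A", "D", "B", "D", "D", "D"]
ability = [10, 12, 14, 16, 18, 20, 22, 24]
card = item_card(responses, ability, key="D")
print(card["percent_correct"], round(card["r_biserial"], 2))
```

In this made-up sample the examinees choosing the keyed answer have a
higher mean ability (20.5) than those choosing the other options, so
the r-biserial is high.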

The GRE Advanced Tests have typically been constructed
without pretesting. Even though pretesting would permit
development of forms more nearly parallel in difficulty, some
differences among test forms in this respect are acceptable since
one edition of a test is statistically equated to other editions.
Thus, students taking a more difficult test would not have to answer
as high a percentage of questions correctly as those taking an
easier edition. For a full year, pretesting was tried for all the
Advanced Tests. The process proved to be of little value in
improving the overall reliability of the tests, and sectioning the
tests to allow for a separately timed pretest section caused
administrative problems. Students who otherwise might have finished
the tests and left early experienced restlessness when they finished
each individual section early.

For other reasons also, pretesting for the Advanced Tests did not
prove particularly useful. First, it is easier for a committee of
examiners in a field to estimate the difficulty and discriminating
power of a question in that field than it is for test specialists to
estimate the difficulty of an Aptitude Test question that will be
used for a widely heterogeneous group embracing all fields of study.
Second, each Advanced Test contains a number of already tried
questions (generally 20 percent of the test, sometimes more)
included for equating purposes. These questions have all proved to
be of appropriate difficulty and high discriminating power.

Preliminary item analysis is an important procedure in controlling
the quality of the Advanced Tests. Preliminary item analysis is
performed also for the Aptitude Test, as an additional check for
such problems as possible misprints, but the previous pretesting
step makes the preliminary item analysis less important for the
Aptitude Test than for the Advanced Tests. Before a test being
administered for the first time is scored, a sample of answer sheets
arriving relatively early is experimentally scored and analyzed. A
question that reveals poor discriminating power, inordinate
difficulty, or a large number of omissions is reviewed again at this
point by test specialists and committee members to make certain
that the question is not ambiguous and that the answer designated
as correct is indeed the only correct answer. If problematical
questions are identified that escaped the attention of the committee
or test specialists earlier, a decision can be made to eliminate the
question from scoring or, possibly, to permit two correct answers.
Many Advanced Tests require no change at this stage; others may
require action in the case of one or a few questions. Because of the
effectiveness of the pretesting process for the Aptitude Test, a
change in the scoring instructions is almost never needed.
Although the methods of pretesting and preliminary item analysis
differ in their importance for the Aptitude Test and the Advanced
Tests, these methods are vital to the maintenance of quality in the
Graduate Record Examinations and are effective in keeping
reliabilities in the very high .80s or above. Pretesting and
preliminary item analysis are discussed in more detail in Chapter 5.

Testing Standards 

The standards that apply to all GRE tests are summarized below.

1. Tests used to assist in making decisions that are typically
irrevocable and have significant impact on students' courses of
action should have reliabilities that do not fall below the upper
.80s or low .90s. Tests with lower reliabilities can be provided for
such purposes as self-evaluation or counseling.

2. All scores used to assist in making significant decisions should
be sufficiently distinct to warrant separate reporting (for
example, score intercorrelations below .80 when reliabilities are
in the .90s).

3. The measures should provide a distribution of scores that
approximates the normal curve.

4. The tests should not be highly speeded.

5. The tests should have appropriate content for the constructs
they are designed to measure and should be positively correlated
with successful performance in graduate school.

6. Sufficient information should be provided to users to permit
appropriate interpretation of scores.



References 

Educational Testing Service. ETS Procedures for Determining the
Validity of Questionable Scores. Princeton, N.J.: Educational
Testing Service, 1975.

Educational Testing Service. GRE 1977-78 Information Bulletin,
National Administrations Edition. Princeton, N.J.: Educational
Testing Service, 1977 (published annually).

Educational Testing Service. GRE Sample Aptitude Test, second
edition. Princeton, N.J.: Educational Testing Service, 1977.

Educational Testing Service. Guide to the Use of the Graduate
Record Examinations, 1977-78. Princeton, N.J.: Educational
Testing Service, 1977.






Chapter 3

DEVELOPMENT OF THE APTITUDE TEST 



The GRE Aptitude Test is a standardized test of general academic
ability. It includes three measures: verbal ability, quantitative
ability, and a newly added analytical ability measure. The Aptitude
Test is intended to reflect skills that have developed over a long
period of time. Although it assumes exposure to a predominantly
English-speaking culture and to the educational practices of the
United States, the test is designed to be as appropriate as possible
for potential graduate students with diverse backgrounds and
interests.

The purpose of the test is to contribute to prediction of a
student's performance in graduate school. Not only is it based on
constructs that are theoretically related to successful study in a
variety of fields, but performance on the test has been demonstrated
to be positively correlated with performance in graduate and
undergraduate school as measured by various criteria. The Aptitude
Test is not intended to measure inherent intellectual capacity or
intelligence, nor is it intended to measure personality traits or
social worth. Its limited purpose is to tap the ability to reason
with words, mathematical concepts, and other abstractions to arrive
at a solution to a problem. Such factors as knowledge of words and
mathematical concepts and practice in reading and fundamental
quantitative operations will, of course, define the limits within
which one can reason using these tools.

The rationale for the content of the Aptitude Test originates in the
need for highly developed fundamental skills in graduate study of
any kind. Three scores rather than a single score are provided for
several reasons: 1) a multidimensional definition of academic talent
will best serve institutions and students in a variety of fields; 2)
the three scores (verbal ability, quantitative ability, and
analytical ability) are sufficiently independent to be providing
complementary information about students; and 3) studies suggest
that each score is related to academic performance in differing
degrees depending on the field and may differentially improve
prediction of graduate school success.

Evolution of the Aptitude Test 

The development and evolution of the Aptitude Test have been
determined by perceptions of the needs of the students and
institutions making up the graduate community and by high standards
of psychometric quality. These perceived needs and established
standards (and the fact that they are sometimes in conflict) are
reflected in published materials concerning the Aptitude Test in
various stages of its evolution. For example, the GRE General
Bulletin, No. 2, in 1946 noted that "further breakdowns of the verbal
ability score are anticipated if analyses of the test results show
them to yield satisfactorily reliable and differentiating part
scores" (p. 3). Although part scores were perceived as potentially
valuable to the graduate community, it was discovered that the
various kinds of verbal questions were so highly intercorrelated
that part scores could not be defended as psychometrically sound.

Throughout its history, the Aptitude Test has been considered to
be relatively independent of passing trends in student interests and
teaching methods. Because the primary advantage of a standardized
test is its capacity to permit comparison of students by the same
standards, only two kinds of change have thus far been introduced
into the Aptitude Test: 1) minor change in specifications
of content expected to result in a measure more appropriate for the
population without compromising parallelism of test forms and
comparability of scores over the years; and 2) change that would
increase the usefulness of the test without subtracting advantages
already offered. In all instances where change has been suggested,
results have been analyzed statistically to determine the possible
effects.

Two recent examples illustrate the kinds of minor change
generally accommodated in the Aptitude Test, typical reasons for
such minor change, and the results of the change as demonstrated
by statistical analysis. Because of current plans in the United
States to introduce the metric system, some of the terminology used
in the quantitative measure of the Aptitude Test has been altered to
reflect changes taking place in the educational system. Some
questions previously referring to feet and inches, for example, may
now refer to meters and centimeters. However, it is not required
that students know the number of inches in a meter or the number of
quarts in a liter. The numbers and computation have not changed, but
the terms may have. Statistical analysis has shown that changes in
the terminology in the questions have not, on the average, affected
their difficulty or their usefulness in distinguishing between high
and low scorers. Until the new system of units and measurement
has become firmly entrenched, the quantitative measure in the
Aptitude Test will not require knowledge of its fundamentals.

Recently, the specifications of the verbal ability measure of the
Aptitude Test were refined to reflect social change thought to have
resulted in a different mix of reading materials in the average
student's experience. Diversification of passages in the reading
comprehension section had previously been assured by concern
for balance among humanities, social studies, and science
passages and inclusion of various styles, such as fiction and
argumentation. The refinement in specifications added a requirement
for one passage relating to minority concerns and one relating to
the concerns of women. The purpose of this refinement was to
increase the appropriateness of the content for the heterogeneous
population and to increase the resemblance of the reading
selections in the test to materials available to the typical
student. Statistical analysis of the passages with content related
to minorities showed that they were not significantly different from
other passages in the same general categories (that is, in the
humanities, social studies, and science) for the total population.

The feasibility of making major changes in the test (such as the
addition of new measures) to increase its usefulness has been
investigated periodically. In the early 1950s, a number of
potentially useful types of questions were tried out in experimental
sections of the GRE Aptitude Test: for example, questions designed
to test the "ability to reason logically in terms of abstract
figures," as the directions in a 1951 pretest suggested; the ability
to interpret data or to judge the sufficiency of data; the ability
to "integrate" material in an essentially artificial language with
more rules, greater complexity, and less dependence on knowledge of
grammar than other such tests; the ability to induce rules in such
tasks as completing analogies, completing a series of symbols or
concepts, and selecting an incompatible term or symbol in an
otherwise logically related series; the ability to judge evidence
(these questions resemble a type of question investigated more
recently, Evaluation of Evidence; see Appendix I); and even a
non-multiple-choice type of question involving categorization of
words in lists. These experimental efforts, however, did not lead to
expansion of the test.

In the late 1960s and early 1970s, such possible measures as
Spatial Visualization, History of Ideas, Writing Skills, High Level
Math Usage, and Logical Reasoning were examined with a view to
permitting students to select optional measures based on their
intended specialization. At that time, the tests judged by a sample
of faculty members on the Advanced Test committees of examiners
to be most important and potentially useful were the Logical
Reasoning Test (assumed to be selected as an option by humanities,
social science, and some natural science students) and the High
Level Math Usage Test (assumed to be selected as an option by
some science students). Despite reliability and promise of validity,
scores on the Logical Reasoning Test were too highly correlated
with verbal scores to be considered a valuable adjunct to the
original Aptitude Test. Although the High Level Math Usage Test
was found to be appropriate for its intended use, its introduction
depended on shortening the part of the quantitative measure
common to all students to only 30 minutes. Such a reduction in time
was considered to be unwise because it limited the common
measure's diversity. However, the High Level Math Usage Test was
incorporated into the Advanced Engineering Test, a measure in
which it appeared particularly useful.

In 1974, a new effort was initiated to consider possible
improvement of the Aptitude Test by broadening its definition of academic
talent. That effort is continuing and includes investigation of
methods of testing for scientific creativity and cognitive style.
Results of research suggested that the verbal and quantitative
portions of the test, as it existed before 1977-78, could be shortened
without reducing reliability below a satisfactory level and without
affecting the comparability of past scores with scores based on the
shortened versions. This research effort also yielded information on
a variety of tests of various aspects of reasoning. A subset of those
tests was selected to form a new measure of analytical skills. The
1977-78 Aptitude Test differs dramatically from the test that
preceded it. However, that difference represents an added value to the
test and maintains the importance of the traditional verbal and
quantitative measures. The accompanying diagram illustrates the
difference between the two tests in scores yielded and in basic
content.

General Format 

The restructured Aptitude Test consists of five separately timed
sections, two of which are 50 minutes long and three of which are
25 minutes long. The verbal measure consists of 80 questions; the
quantitative measure consists of 55 questions; and the analytical
measure consists of 70 questions. Equal amounts of time, 50
minutes each, are devoted to the three measures. Twenty-five of
the 175 minutes of the student's time is spent answering trial
questions.
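The timing figures above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the variable names are assumptions introduced here, and the numbers are those stated in the text.

```python
# Section lengths stated in the text: two 50-minute and three 25-minute sections.
section_minutes = [50, 50, 25, 25, 25]
total_minutes = sum(section_minutes)

# Time devoted to each scored measure, as stated in the text.
measure_minutes = {"verbal": 50, "quantitative": 50, "analytical": 50}
scored_minutes = sum(measure_minutes.values())

# Whatever is left over is the time spent on trial (unscored) questions.
trial_minutes = total_minutes - scored_minutes

# Number of questions in each measure, as stated in the text.
questions = {"verbal": 80, "quantitative": 55, "analytical": 70}
seconds_per_question = {
    name: measure_minutes[name] * 60 / count for name, count in questions.items()
}

print(total_minutes, scored_minutes, trial_minutes)
for name, seconds in seconds_per_question.items():
    print(f"{name}: {seconds:.1f} seconds per question")
```

The per-question averages are a by-product of the same figures; they describe average pacing only, not how time is actually distributed within a section.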

Although not contributing to the scores of the students who take
them, trial questions are considered an integral part of the
examination, essential to maintaining the high quality of the test. Unless
the trial questions can be given to a sample of the regular GRE
population under normal standardized conditions, the statistical
data most dependable in making parallel forms of the test cannot
be obtained. The trial questions represent research that directly
benefits the students who take the tests, and students are, of
course, informed in the GRE Information Bulletin of the inclusion of
trial questions in the test. In addition, the test supervisor, just
before the examination is administered, reiterates that "each
edition of the Aptitude Test contains a number of questions being tried
out or pretested for possible use in future editions of the test.
Therefore, you may not have the same test book as your neighbor.
Answers to these trial questions will not be counted in your scores"
(GRE National Administrations, 1977-78 Supervisor's Manual, p.
15). Taking the test is considered to be acceptance of or consent to
that situation.

ORIGINAL APTITUDE TEST              RESTRUCTURED APTITUDE TEST
(Before October 1977)

VERBAL ABILITY                      VERBAL ABILITY
  Discrete Verbal Questions           Discrete Verbal Questions
  (25 min.)                           (25 min.)
  Reading Comprehension               Reading Comprehension
  (all topics) (50 min.)              (all topics) (25 min.)

QUANTITATIVE ABILITY QUESTIONS      QUANTITATIVE ABILITY QUESTIONS
(75 min.)                           (50 min.)

                                    ANALYTICAL ABILITY QUESTIONS
                                    (50 min.)

Content Characteristics 

The content of each edition of the Aptitude Test is determined by
concern for appropriateness to the population and comparability
with past editions. Appropriateness of the material is assured by
inclusion of diversified content, use of a variety of kinds of questions,
and selection of nontechnical material of the sort likely to have
been encountered by students planning graduate study.
Comparability or stability is maintained by constant requirements for
similarity in content and statistical characteristics in all editions of the
test.

In the following discussion of the Aptitude Test, the part of each
question that poses the problem will be referred to as the "stem,"
the answer choices as "options," the wrong choices as
"distractors," and the right choice as the "correct response."

Verbal Ability Measure 

The verbal ability measure, designed to test the ability to
understand and manipulate written words in order to solve problems,
consists of four question types: antonyms, analogies, sentence
completions (discrete questions, so called because each question
is independent, sharing no common stimulus material), and
reading comprehension sets. Discrete questions are drawn from four
areas of human interest: 1) the arts and humanities, 2) the social
studies and concerns of practical or everyday life, 3) the world of
science and nature, and 4) the domain of human relationships and
feelings. Reading passages may be drawn from the humanities,
social sciences, and natural sciences and may represent a narrative
as well as a discursive style.

Equal amounts of time are devoted to discrete questions and sets
of reading comprehension questions. Fifty-five discrete questions
can be administered in 25 minutes and 25 reading comprehension
questions in 25 minutes. Discrete questions are notable for their
efficiency (contributing high reliability for the amount of time
invested), and reading comprehension questions are distinguished
by the close link they provide between the test and the actual
reading activities of graduate students.

Antonyms. Antonym questions provide the least context. An
isolated word or phrase is presented in the stem; the options
consist of possible antonyms to the stem. Distractors may be
chosen on the basis of their similarity in sound or spelling to other
words, but synonyms are avoided as distractors, since they may
prove more tricky than challenging to students. The purpose of
antonym questions is to measure not only knowledge of words but
also the ability to reason from a positive to a negative concept, to
leap conceptually from one extreme to another. Word frequency
lists are not generally used in selecting words to be used in
antonym questions because the pretesting process provides the
best indication of the familiarity of the population with the word.
The difficulty of an antonym question may reflect the frequency of
appearance of the words in speech and writing as well as the
attractiveness of the options.

Antonyms may require only rather general knowledge of a word
(see the first example below), or they may require that a student
make fine distinctions (see the second example). They may appear
as single words or as phrases and may be any part of speech. The
directions for antonym questions and three examples appear
below. An asterisk denotes the correct response.

Directions: Each question below consists of a word printed in
capital letters, followed by five words or phrases lettered A
through E. Choose the lettered word or phrase that is most nearly
opposite in meaning to the word in capital letters. Since some of
the questions require you to distinguish fine shades of meaning,
be sure to consider all the choices before deciding which one is
best.

1. CONSCRIPT: (A) mediator *(B) volunteer (C) eccentric
(D) comedian (E) villain

2. MURKY: (A) clamorous (B) complex *(C) full of light
(D) endowed with beauty (E) free from error

3. PROMULGATE: (A) distort (B) demote *(C) suppress
(D) retard (E) discourage

Analogies. Analogy questions provide somewhat more context
than antonyms and require the student to recognize parallel
relationships. The two words in the stem are separated by a colon
suggesting that they share a relationship. Each of the options presents
a pair of words, again separated by a colon to suggest a
relationship between them. The relationship may be kind, size,
contiguity, or degree. Analogies may be classified as independent or
overlapping.

In an independent analogy, neither of the words comprising the
correct response is similar in meaning to a word in the stem (for
example, COLD:CONGEALMENT::heat:incandescence, where cold
and heat, though extremes of the same continuum, are no more
similar in meaning than congealment and incandescence). In an
overlapping analogy, one or both of the words in the correct
response is suggestive of the meaning of one or both of the words
in the stem (for example, METAL:DROSS::wheat:chaff, where dross
and chaff both signify a waste product). There are more
independent than overlapping analogies, and, even in overlapping
analogies, dependence on word associations alone for the solution
is avoided, often by inclusion of distractors with similar
associations. The purpose of the analogy is to test the ability to recognize
parallel rather than loosely related word pairs. Analogies may also
be based on words with only concrete referents (see the first
example below), with only abstract referents (see the second
example), or with both kinds of words (see the third example). The
directions for analogy questions and three examples follow. An
asterisk denotes the correct response.

Directions: In each of the following questions, a related pair of
words or phrases is followed by five lettered pairs of words or
phrases. Select the lettered pair which best expresses a
relationship similar to that expressed in the original pair.

1. FROND:FERN:: (A) acorn:oak (B) bulb:tulip
*(C) needle:pine (D) desert:cactus (E) foliage:blossom

2. OBEDIENT:OBSEQUIOUS:: (A) ludicrous:ridiculous
*(B) helpful:officious (C) unusual:obvious
(D) happy:zealous (E) serene:agitated

3. JUNTA:POLITICAL:: (A) team:successful
*(B) council:advisory (C) jury:secretive
(D) catalogue:arbitrary (E) parent:instructive

Sentence Completions. The third discrete question type, sentence
completions, provides increased context and is closely related to
reading comprehension. Sentences are usually selected from
reading materials that might be commonly available to students. They
contain one blank or two, and students are required to select the
completion that is logically and stylistically consistent with the rest
of the sentence.

The four examples of sentence completions illustrate, in order,
the four areas of human interest to which an antonym, analogy, or
sentence completion question may be related: 1) the arts and
humanities, 2) the social studies and concerns of practical or everyday
life, 3) the world of science and nature, and 4) the domain of human
relationships and feelings. An asterisk denotes the correct
response.

Directions: Each of the sentences below has one or more blank
spaces, each blank indicating that a word has been omitted.
Beneath the sentence are five lettered words or sets of words.
You are to choose the one word or set of words which, when
inserted in the sentence, best fits in with the meaning of the
sentence as a whole.




1. Some time ago translators realized that they must ____ the idea
that an ancient classic, simply because it is ancient, must be
rendered in the archaic English of another era.

(A) extract (B) absolve (C) maintain (D) perpetuate
*(E) relinquish

2. Some people argue that the growth of industrial research has
been too rapid, that in some companies research is ____ which
is supported because of the ____ associated with it rather than
because of the real benefits derived.

*(A) a fad..glamour (B) a luxury..profit
(C) a necessity..satisfaction (D) an obstacle..prestige
(E) an innovation..stability

3. When a new comet appeared in 1577, its path straight through
what were supposed to be the ____ spheres that formed the
skies ____ the view that these spheres did not exist.

(A) solid..punctured (B) vacant..dispersed
*(C) impenetrable..encouraged (D) invisible..exploded
(E) perforated..corroborated

4. She was saddened to hear that her colleagues continued to
____ her protege, for she had hoped that success would
____ him.

(A) patronize..enrage *(B) disparage..vindicate
(C) underwrite..attract (D) flatter..encourage
(E) deride..humiliate

Reading Comprehension Sets. Reading comprehension passages
are of varying lengths. In each edition of the test, there are two
relatively long passages, each providing the basis for answering seven
or eight questions, and three relatively short passages, each
providing the basis for answering three or four questions. Test forms are
comparable in terms of the total number of words in passages in
the reading comprehension section.

Although the mean difficulty of the questions themselves for the
examinees is considered the best index of the difficulty of the
reading comprehension section of the test, an attempt is made to
achieve an appropriate range and variation of levels of difficulty of
the reading material. In a special analysis applying the Simple Test
Approach for Readability (STAR) developed by General Motors, an
average of 14.3 grade-level equivalency was obtained for two recent
GRE verbal forms. This grade-level equivalent suggests that the
overall reading level was not difficult for college graduates. The
grade-level equivalency of passages ranged from 10.1 to 21.2 in this
analysis, and the correlation between mean question difficulty and
the difficulty of the passage on which the questions are based was
only .22. It is not surprising that the difficulty of questions has a low
relationship to the difficulty of the passage associated with those
questions. A question's difficulty depends on a number of factors
such as the attractiveness of the distractors and the type of reading
skill being tested.
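The correlation of .22 reported above relates two numbers per passage: its grade-level equivalency and the mean difficulty of its questions. The manual does not say how the coefficient was computed; the sketch below assumes an ordinary Pearson product-moment correlation, and the data values are invented purely for illustration (they are not GRE data and are not meant to reproduce .22).

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / math.sqrt(var_x * var_y)

# HYPOTHETICAL values, one pair per passage (not taken from the manual).
passage_grade_level = [10.1, 12.4, 14.0, 16.5, 21.2]
mean_question_difficulty = [11.8, 13.9, 11.2, 12.7, 12.9]

r = pearson_r(passage_grade_level, mean_question_difficulty)
print(f"r = {r:.2f}")
```

A coefficient near zero, as in the manual's result, means that knowing how hard a passage is to read says little about how hard its questions will be.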

The six example questions illustrate, in order, the six major types
of reading comprehension questions that appear in the test. These
types focus on 1) the main idea or primary purpose of the passage;
2) information explicitly stated in the passage; 3) information or
ideas implied or suggested by the author; 4) possible application of
the author's ideas to other situations; 5) the author's logic,
reasoning, or persuasive techniques; and 6) the tone of the passage or the
author's attitude as it is revealed in the language used. An asterisk
denotes the correct response.



Directions: Each passage in this group is followed by questions
based on its content. After reading a passage, choose the best
answer to each question and blacken the corresponding space
on the answer sheet. Answer all questions following a passage
on the basis of what is stated or implied in that passage.

The literary generation that crusaded against Puritanism and
the genteel tradition, against stereotypes and sentimentality,
also saw what Merle Curti terms "the beginnings of a new and
realistic interest in American regions and American folk." The
trend toward twentieth-century realism exemplified in this country
by the works of Sherwood Anderson, Sinclair Lewis, and
Theodore Dreiser was paralleled in the work of black writers of the
same period. Rudolph Fisher, Arna Bontemps, and Jessie Fauset,
for example, reached an almost full scale of self-revelation and a
substantial degree of self-criticism. By breaking with past literary
tradition, black writers in the 1920's were developing a greater
sophistication of style and wider and more universal appeal.

American art faced a problem in the early twenties, a problem
born of the fact that for years the white American artist had
regarded the American art scene as unsophisticated, and the
black artist had felt oppressed by the social situation. Frequently
escaping to life abroad where new developments in art were
taking place, neither contributed much toward the development of a
distinctive American art. In the mid-twenties, the same forces that
inspired the upsurge of new, more realistic, and unapologetic
talent in the other arts inspired changes in the attitude of artists.
In addition, of course, the rising tide of modernism in Europe and
at home encouraged young black painters to turn away from
traditionalism in both subject matter and style.

Fortunately, as with the parallel movement in literature, this
movement in painting did not lead the black artists into racialist
art. On the contrary, it led them into the mainstream. American
artists were beginning to develop Negro themes and subjects as
new native American material. The older white artists had
handled the Negro themes in a somewhat casual and superficial
manner. For many young white artists of the twenties, blacks
were the subject of careful and penetrating interpretation. The
fact that young white American artists and their young black
contemporaries shared this new interest in black life was
significant. A common ground was established among young
artists. The notion of the black world as a restricted province to
which the black artist was confined was removed. At the same
time, the black artist was challenged by the task of self-revelation
and forced to attempt it in competition with other artists. The
poise and originality of the young artists of that period and their
honest depiction of American life brought them closer to the
realization that race was a medium of expression, not an end in
itself. For though their work was avowedly racial for the most part,
they ranged with an increasing sense of freedom through the
universe of a common human art. The strength and vigor of artists
like Aaron Douglas, Palmer Hayden, and Hale Woodruff were a
reflection of superior advantages and training. Of equal, if not
greater, importance was the fact that their spiritual enlargement
stemmed from the growing conception of American culture
vitally and necessarily including the materials of black life.



1. The author's primary purpose in the passage is to

(A) enumerate several dilemmas faced by black artists in
America
(B) explain the differences between realism in literature and
realism in painting
(C) contrast the works of black artists with those of their white
contemporaries
*(D) analyze the effect on black artists of the movement toward
realism in art
(E) encourage the inclusion of black life in artistic depictions
of American culture

2. The author mentions Sinclair Lewis and Jessie Fauset as
examples of writers who

(A) awakened European interest in American culture
*(B) broke away from past literary traditions
(C) portrayed the lives of blacks realistically
(D) were among the most prolific writers of the 1920's
(E) influenced artists in fields other than literature

3. It can be inferred that, in the early decades of the twentieth
century, many American painters went abroad because they

(A) hoped to redress social injustices in America
(B) disliked the trend toward modernism in America
*(C) regarded Europe as the place where new developments in
art were taking place
(D) wished to encourage Europeans to join the movement
toward realism
(E) could find no way to support themselves in America

4. The statements in the passage suggest that the author would
most likely react to a movement among black artists toward
racialist art with

(A) amused cynicism
(B) deliberate indifference
(C) enthusiastic encouragement
(D) cautious optimism
*(E) disappointed disapproval

5. The author quotes Merle Curti in order to

*(A) support his own analysis of a trend
(B) indicate the ambiguity of his topic
(C) provide a contrasting viewpoint
(D) foreshadow new directions attitudes may take
(E) illustrate past resentment to change

6. The tone and content of the passage suggest that its source
was most likely

(A) a guidebook to a collection of paintings by black artists
*(B) an essay on the development of a characteristic American
style by black and white artists
(C) an editorial on ghetto life as experienced by artists during
the 1920's
(D) a book on the way art reflects public opinion as
exemplified by the trend toward realism
(E) a biography of a famous black American artist who lives in
Europe



Content Specifications for the Verbal Ability Measure 

Content specifications, or statements of required numbers and
kinds of content for each test form, assure the parallelism of all
forms. The tables below show the breakdown of
content for the current verbal ability measure.

As Tables 1, 2, and 3 illustrate, balance of diversified materials is
a primary consideration to make the test appropriate for the various
segments of the population. For example, since Coffman (1965) has
shown that men tend to do slightly better on discrete questions
classified as belonging to the world of science and nature and to
the domain of social studies and concerns of practical or everyday
life whereas women tend to do slightly better on the arts and
humanities (aesthetic, philosophical) and human relationships
questions, balance among those classes of questions should result in a
test appropriate for both sexes. The rationale for balancing content
throughout the test is an extension of these observations. The
greater the variety of material provided, the more likely it is that the
diverse population will be well served, assuming that the material is
generally accessible (nontechnical).

Table 1: Specifications for Discrete
Verbal Questions
(55 questions)

Content                                             Questions

Antonyms
  Arts and Humanities                                  5
  Social Studies and Practical or Everyday Life        5
  Science and Nature                                   5
  Human Relationships and Feelings                     5
  General definitions                                  8-14
  Fine distinctions                                    6-12
  Single words                                         10-16
  Phrases                                              4-10
  Verb                                                 3-9
  Noun                                                 3-9
  Adjective                                            5-11
  Other parts of speech

Analogies
  Arts and Humanities                                  4-6
  Social Studies and Practical or Everyday Life        4-6
  Science and Nature                                   4-6
  Human Relationships and Feelings                     4-6
  Concrete                                             4-8
  Mixed                                                5-11
  Abstract                                             4-8
  Independent                                          11-15
  Overlapping                                          5-9

Sentence Completions
  Arts and Humanities                                  4
  Social Studies and Practical or Everyday Life        5
  Science and Nature                                   4
  Human Relationships and Feelings                     4
  One blank                                            5-9
  Two blanks                                           11-15






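Table 2: Specifications for
Reading Comprehension Passages
(5 passages)

                         Number of Passages
Subject Matter        Long      Short     Total

Humanities              1         -         1
Social Studies          -         2         2
Natural Science         1         -         1
Other                   -         1         1
Total                   2         3         5

[Column headings are partly illegible in the source. "Long" and
"Short" follow the text's two relatively long and three relatively
short passages; a heading reading "450 words" also survives, but
the column it belongs to cannot be determined.]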



Table 3: Specifications for
Reading Comprehension Questions
(25 questions)

                        Number of
                        Questions

Main Idea                 3-5
Explicit Statement        5-8
Inference                 6-7
Application               2-3
Logic                     2-3
Tone                      1-2
Total                     25
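Specifications of this kind lend themselves to a mechanical check of a draft form. The sketch below is an illustration only, not ETS's procedure: the function name, the dictionary layout, and the draft allocation are all assumptions introduced here, and the Explicit Statement entry (printed without a hyphen in the scanned table) is read as 5-8.

```python
# Allowed (minimum, maximum) number of questions per type, read from Table 3.
# ASSUMPTION: the "Explicit Statement" entry is read as the range 5-8.
SPEC = {
    "main idea": (3, 5),
    "explicit statement": (5, 8),
    "inference": (6, 7),
    "application": (2, 3),
    "logic": (2, 3),
    "tone": (1, 2),
}
TOTAL_QUESTIONS = 25

def check_form(counts):
    """Return a list of problems; an empty list means the form meets the spec."""
    problems = []
    for kind, (low, high) in SPEC.items():
        n = counts.get(kind, 0)
        if not low <= n <= high:
            problems.append(f"{kind}: {n} is outside {low}-{high}")
    for kind in counts:
        if kind not in SPEC:
            problems.append(f"{kind}: not a question type in the specification")
    total = sum(counts.values())
    if total != TOTAL_QUESTIONS:
        problems.append(f"total: {total} questions, expected {TOTAL_QUESTIONS}")
    return problems

# HYPOTHETICAL draft allocation, invented for illustration.
draft_form = {
    "main idea": 4,
    "explicit statement": 7,
    "inference": 7,
    "application": 3,
    "logic": 2,
    "tone": 2,
}
print(check_form(draft_form))
```

Because the ranges allow between 19 and 28 questions in all, the fixed total of 25 acts as a separate constraint and has to be checked on its own.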



Quantitative Ability Measure 

The quantitative ability section is designed to test basic
mathematical skills, understanding of elementary mathematical
concepts, and ability to reason quantitatively and to solve problems
in a quantitative setting. This section consists of three question
types: discrete mathematics, data interpretation, and quantitative
comparison.

Each discrete mathematics question contains all the information
needed for answering the question, except for the basic
mathematical knowledge assumed to be common to the
backgrounds of all students. Many of these questions, such as examples
1 and 2, require little more than manipulation and very basic
knowledge; others, such as examples 3 and 4, require the student
to read, understand, and solve a problem that involves either an
actual or an abstract situation.

The data interpretation questions, like the reading
comprehension questions in the verbal measure, usually appear in sets based
on stimulus material that precedes the questions. The stimulus
material for these questions consists of data presented in graphs or
tables. Data interpretation questions are designed to test the ability to
synthesize information, to select the appropriate data for answering
a question, as in example 5, or to determine that sufficient
information for answering a question is not given, as in example 6.

Directions for discrete mathematics and data interpretation
questions and examples of each type follow. An asterisk denotes the
correct response.

Directions: Solve each of the following problems, using any
available space on the page for scratch work. Then indicate the
best answer in the appropriate space on the answer sheet.

Note: Figures which accompany these problems are intended to
provide information useful in solving the problems. They are
drawn as accurately as possible EXCEPT when it is stated in a
specific problem that its figure is not drawn to scale. All figures lie
in a plane unless otherwise indicated.

All numbers used are real numbers.
1. If [equation illegible in source], then 1 - x =

[answer options illegible in source]

[Figure not reproduced: triangle ORS]

2. If A represents the area of triangle ORS above, then 2A =

(A) 4 (B) 6 *(C) 12 (D) 24 (E) 36

3. After an initial deposit of x dollars, the amount of money in a
certain fund is doubled at the end of each month for 5 months.
If at the end of the 5-month period there is a total of $560 in the
fund, how much money was in the fund at the beginning of the
third month?

(A) $17.50 (B) $35 *(C) $70 (D) $140 (E) $224

4. Suppose that @ stands for a binary operation which adds the
reciprocals of the two numbers it operates on. For example,
[example illegible in source].

Which of the following statements is (are) true for all positive
a, b?

I. (1/a) @ (1/b) = a + b

II. a @ b = 1/(a + b)

III. 1/(a @ b) = ab/(a + b)

(A) I only (B) II only (C) III only (D) I and II only
*(E) I and III only





Questions 5-6 refer to the following graphs.

[Graphs not reproduced: FRINGE BENEFIT PAYMENTS BY TYPE,
1955 AND 1965, ALL INDUSTRIES. 1955: 39.1¢ per payroll hour = 100%;
1965: 71.5¢ per payroll hour = 100%.]

Type A. Paid vacations, holidays, sick leave, etc.
Type B. Private pension and welfare fund contributions,
severance pay, etc.
Type C. Legally required payments (social security, etc.)
Type D. Paid rest periods, national guard, jury duty, etc.
Type E. Profit-sharing payments, other bonuses.

5. What was the approximate increase in cents per payroll hour
from 1955 to 1965 for type C fringe benefit payments?

(A) [illegible] (B) [illegible] (C) 2.3¢ *(D) 7.3¢ (E) 9.2¢

6. If fringe benefit payments averaged 25 percent of the total
payroll in 1965, then fringe benefit payments averaged
approximately what percent of the total payroll in 1955?

(A) 10% (B) 14% (C) 25% (D) 46%
*(E) It cannot be determined from the information given.
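The keyed answers to examples 3 and 4 can be checked with exact arithmetic. The sketch below is offered only as an illustration: the function names are assumptions, and the three statements of example 4 are taken in the reading I. (1/a) @ (1/b) = a + b, II. a @ b = 1/(a + b), III. 1/(a @ b) = ab/(a + b), which is a reconstruction of a badly garbled original. Trying a few values cannot prove a statement for all positive a and b; the algebra does that (for instance, a @ b = 1/a + 1/b = (a + b)/(ab), which gives III directly), and the code merely confirms it on sample values.

```python
from fractions import Fraction

# Example 3: the fund doubles at the end of each of 5 months and ends at $560.
final_amount = Fraction(560)
initial_deposit = final_amount / 2**5
# By the beginning of the third month, two doublings have taken place.
start_of_third_month = initial_deposit * 2**2

def op(a, b):
    """The operation of example 4: add the reciprocals of the two numbers."""
    return 1 / Fraction(a) + 1 / Fraction(b)

def check_statements(a, b):
    """Evaluate statements I, II, and III (as read above) for one pair a, b."""
    a, b = Fraction(a), Fraction(b)
    stmt_1 = op(1 / a, 1 / b) == a + b
    stmt_2 = op(a, b) == 1 / (a + b)
    stmt_3 = 1 / op(a, b) == a * b / (a + b)
    return stmt_1, stmt_2, stmt_3

print(initial_deposit, start_of_third_month)
for a, b in [(1, 1), (2, 3), (Fraction(1, 2), 5)]:
    print(a, b, check_statements(a, b))
```

Using `Fraction` rather than floating-point numbers keeps the equality tests exact, so a mismatch reflects the algebra rather than rounding.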

The third question type, quantitative comparisons, was included
in the GRE Aptitude Test for the first time in the 1977-78 year,
although variations of this type of question have been used for a
number of years in other testing programs. Quantitative
comparisons are characterized by a fixed set of four options and are the
least time consuming of the three types of questions in this section;
the data interpretation questions are the most time consuming. The
efficiency of quantitative comparisons was one factor permitting
restructuring of the Aptitude Test to include a new measure. Since
performance on quantitative comparisons correlated so highly
(approximately .90) with performance on other types of quantitative
questions used in the test before restructuring, it was possible to
reduce the testing time without reducing the number of questions.
Some of the more time-consuming questions were replaced by
quantitative comparisons, with the expectation that the high
reliability of the test and the comparability of scores would be
maintained.

Quantitative comparisons are designed to test the ability to
reason quickly and accurately about the relative sizes of two
quantities or to perceive that not enough information is given to make
such a decision. Some questions, as in example 1 of the directions
that follow, only require some manipulation to determine which of
the quantities is greater: the one in Column A or the one in Column B.
Other questions require the student to reason more or to think of
special cases in which the relative sizes of the quantities reverse,
or, as in example 2, to visualize other possible ways in which a
figure could be drawn within the ground rules for figures given in
the directions.
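The "special cases" reasoning described above can be sketched in code. This is an illustration under stated assumptions, not a description of how the test is scored: the function `compare`, its arguments, and the second item are hypothetical, introduced here only to show how a single reversal forces the fourth option.

```python
def compare(column_a, column_b, cases):
    """Pick A, B, C, or D from how two quantities compare across trial cases.

    column_a and column_b are functions of the unknowns; each case is a
    dict of values for those unknowns.
    """
    if not cases:
        raise ValueError("at least one case is required")
    outcomes = set()
    for case in cases:
        a, b = column_a(**case), column_b(**case)
        outcomes.add("A" if a > b else "B" if a < b else "C")
    # One consistent outcome suggests that answer; mixed outcomes mean D.
    return outcomes.pop() if len(outcomes) == 1 else "D"

# Example 1 from the directions: 2 x 6 versus 2 + 6 (no unknowns).
print(compare(lambda: 2 * 6, lambda: 2 + 6, [{}]))

# HYPOTHETICAL item, not from the test: x squared versus x, for positive x.
print(compare(lambda x: x**2, lambda x: x, [{"x": 2}, {"x": 0.5}]))
```

Agreement across the sampled cases only suggests A, B, or C, since an untried case might still reverse the relationship; a reversal, once found, does establish D.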



Directions: Each of the following questions consists of two
quantities, one in Column A and one in Column B. You are to compare
the two quantities and on the answer sheet blacken space

A if the quantity in Column A is the greater;
B if the quantity in Column B is the greater;
C if the quantities are equal;
D if the relationship cannot be determined from the information
given.

Common Information: In a question, information concerning one or
both of the quantities to be compared is centered above the two
columns. A symbol that appears in both columns represents the
same thing in Column A as it does in Column B.

Numbers: All numbers used are real numbers.

Figures: Position of points, angles, regions, etc. can be
assumed to be in the order shown.

Lines shown as straight can be assumed to be straight.

Figures are assumed to lie in the plane unless otherwise
indicated.

Figures which accompany questions are intended to provide
information useful in answering the questions. However, unless a
note states that a figure is drawn to scale, you should solve these
problems NOT by estimating sizes by sight or by measurement, but
by using your knowledge of mathematics (see example 2 below).



                  Column A      Column B      Sample Answers

Example 1:        2 × 6         2 + 6         A

Examples 2-4 refer to △PQR.

[Figure: triangle PQR, with point N on side PQ]

Example 2:        PN            NQ            D
(since equal measures cannot be assumed, even though PN and
NQ appear equal)

Example 3:        x             y             B
(since N is between P and Q)

Example 4:        w + z         180           C
(since PQ is a straight line)
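The four answer choices can be read as a small decision rule. The sketch below is illustrative only: the function name and the idea of listing "special cases" as value pairs are assumptions made for this example, not part of the test directions. Checking a handful of cases can show that the relationship is indeterminate (D), but it cannot by itself prove A, B, or C.

```python
def compare_quantities(cases):
    """Return 'A', 'B', 'C', or 'D' for a list of (column_a, column_b) pairs.

    Each pair is one admissible special case of the same question.
    """
    relations = set()
    for a, b in cases:
        if a > b:
            relations.add("A")
        elif b > a:
            relations.add("B")
        else:
            relations.add("C")
    # One consistent relation is the answer; mixed (or no) evidence gives D.
    return relations.pop() if len(relations) == 1 else "D"


# Example 1 from the directions: 2 x 6 compared with 2 + 6.
print(compare_quantities([(2 * 6, 2 + 6)]))

# A hypothetical question comparing x squared with x: the relation
# reverses across the special cases x = 0.5, 1, 2.
print(compare_quantities([(x * x, x) for x in (0.5, 1, 2)]))
```

The second call illustrates the "special cases in which the relative sizes of the quantities reverse" mentioned in the text.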



Content Specifications for the
Quantitative Ability Measure

Content specifications for the quantitative measure are given in
Table 4 below.



Table 4: Quantitative Specifications for the
GRE Aptitude Test

Question Type               Arithmetic   Algebra   Geometry   Total

Discrete Mathematics             5           5         5        15
Data Interpretation            8-10        0-2       0-1        10
Quantitative Comparisons        10          10        10        30

Total                         23-25      15-17     15-16        55



Since this section of the test is designed primarily to measure the
ability to reason quantitatively, the mathematics required does not
extend beyond that assumed to be common to the mathematics
background of all students. Questions classified as arithmetic can
be answered by performing arithmetic operations (add, subtract,
multiply, divide, find percents or averages), by reasoning, or by a
combination of the two.

The algebra required does not extend beyond that usually
covered in a first-year high school course and includes such topics
as properties of numbers (odd and even integers, prime numbers,
divisibility, and factors), operations with signed numbers, linear
equations, factorable quadratic equations, factoring, simplifying
algebraic expressions, exponents, and radicals. The skills required
include the ability to solve simple equations, the ability to read and
set up an equation for solving a complex problem, and the ability to
apply basic algebraic skills to solve unfamiliar problems. Unusual
notation is used only when it is explicitly defined for a particular
question.

The geometry is limited primarily to measurement and intuitive
geometry or spatial visualization. Topics include properties
associated with parallel lines, circles, triangles, rectangles, and
other polygons and measurement-related concepts of area,
perimeter, volume, the Pythagorean theorem, and angle measure in
degrees. Knowledge of simple coordinate geometry and special
triangles such as isosceles, equilateral, and 30°-60°-90° triangles is
also assumed. Knowledge of theorems and the ability to construct
proofs that are usually learned in a formal geometry course are not
measured.



Analytical Ability Measure 

Questions in this new measure are designed to tap students'
abilities to recognize logical relationships (for example, between
evidence and a hypothesis, between premises and a conclusion, or
between stated facts and possible explanations); to judge the
consistency of interrelated statements; to draw conclusions from a
complex series of statements; to use a sequential procedure to
eliminate incorrect choices in order to reach a conclusion; to make
inferences from statements expressing relationships among
abstract entities such as nonverbal or nonnumerical symbols; and
to determine relationships between independent or interdependent
categories or groups. Three types of questions are used in measuring
these analytical skills: analysis of explanations, logical diagrams,
and analytical reasoning. If continuing research should
identify other question types that also effectively tap these skills,
the content of the measure may gradually change if such change is
demonstrated to represent an improvement. As experience with the
new measure accumulates, changes may also be made in the
directions, to simplify and clarify them wherever possible, or in the
format. Such changes will be made under conditions designed to
maintain score comparability from one test edition to another.

Analysis of Explanations. Each set of analysis of explanations
questions is preceded by a narrative describing related events and
a statement of a result, which may be surprising in light of the facts
presented. Actually, the result may not follow directly from the
situation but may be dictated by other events consistent with the
situation, although not described. One part of the student's task is
to imagine what missing information might plausibly explain the
result. Although the measure is called analysis of explanations, its
purpose may be broader than that title implies. It measures the
ability to recognize inconsistencies and deducible information, to
hypothesize, and to judge the relevance of certain facts to possible
hypotheses or possible explanations of a stated fact. The measure
also requires that a sequential procedure be followed in arriving at
the correct answer. Choice A must be eliminated before choice B
can be considered, and so on. Since this is a fixed-format type of
question, each question in a set presents the same five answer
choices. The directions and sample questions are given below.

Directions: For each set of questions, a fact situation and a result
are presented. Several numbered statements follow the result.
Each statement is to be evaluated in relation to the fact situation
and result.

Consider each statement separately from the other statements.
For each one, examine the following sequence of decisions, in
the order A, B, C, D, E. Each decision results in selecting or
eliminating a choice. The first choice that cannot be eliminated is
the correct answer.

A  Is the statement inconsistent with, or contradictory to,
   something in the fact situation, the result, or both together?
   If so, choose A.
   If not,

B  Does the statement present a possible adequate explanation
   of the result?
   If so, choose B.
   If not,

C  Does the statement have to be true if the fact situation and
   result are as stated?
   If so, the statement is deducible from something in the fact
   situation, the result, or both together; choose C.
   If not,

D  Does the statement either support or weaken a possible
   explanation of the result?
   If so, the statement is relevant to an explanation; choose D.

E  If not, the statement is irrelevant to an explanation of
   the result; choose E.

Use common sense to decide whether explanations are adequate
and whether statements are inconsistent or deducible. No formal
system of logic is presupposed. Do not consider extremely
unlikely or remote possibilities.
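The sequential procedure in these directions amounts to a chain of tests in which the first one that succeeds fixes the answer. The following is a minimal sketch of that chain; the function and parameter names are hypothetical, and the four judgments themselves (which the examinee must make by common sense) are simply passed in as true or false.

```python
def classify_statement(inconsistent, explains, deducible, relevant):
    """Apply the A-B-C-D-E decision sequence from the directions.

    Each argument is the examinee's yes/no judgment for one decision.
    The first decision answered "yes" determines the response.
    """
    if inconsistent:
        return "A"  # contradicts the fact situation, the result, or both
    if explains:
        return "B"  # a possible adequate explanation of the result
    if deducible:
        return "C"  # must be true if situation and result are as stated
    if relevant:
        return "D"  # supports or weakens a possible explanation
    return "E"      # irrelevant to an explanation of the result


# A statement judged to explain the result is keyed B even if it would
# also count as relevant, because B is reached before D.
print(classify_statement(False, True, False, True))
```

The ordering is what makes the format "sequential": a later choice is considered only after every earlier one has been eliminated.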






Situation: In an attempt to end the theft of books from Parkman
University Library, Elnora Johnson, the chief librarian,
initiated a stringent inspection program at the beginning
of the fall term. At the library entrance, Johnson
posted inspectors to check that each library book leaving
the building had a checkout slip bearing the call
number of the book, its due date, and the borrower's
identification number. The library retained a carbon
copy of this slip as its only record that the book had
been checked out. Johnson ordered the inspectors to
search for concealed library books in attache cases,
bookbags, and all other containers large enough to
hold a book. Since no new personnel could be hired, all
library personnel took turns serving as inspectors,
though many complained of their embarrassment in
conducting the searches.

Result: During that term Margaret Zimmer stole twenty-five
library books.



Directions: In this part, you are to choose from five diagrams the
one that illustrates the relationship among the given classes better
than any of the other diagrams offered.

There are three possible relationships between any two different
classes:

[Diagram: one circle inside another]
Indicates that one class is completely contained
in the other, but not vice versa.

[Diagram: two overlapping circles]
Indicates that neither class is completely contained
in the other, but the two do
have members in common.

[Diagram: two separate circles]
Indicates that there are no members in common.





1. Zimmer stole the books before the inspection system began.
(Correct response A)

2. Zimmer dropped the books out of a second-story window into a
clump of bushes and retrieved them after she left the building.
(Correct response B)

3. During that term, if Zimmer carried a bookbag out of the library
entrance door during regular hours, an inspector was supposed
to check it. (Correct response C)

4. The doors to the library fire escapes are equipped with alarm
bells set off by opening the doors. (Correct response D)

5. The library had at one time kept two carbon copies of each
checkout slip. (Correct response E)

Logical Diagrams. Logical diagrams is also a fixed-format
measure; that is, the same options apply to each of several sets of
questions. Students are given five circle diagrams expressing
different class relationships. They are then asked to look at sets of
words and choose the diagram that best illustrates the relationship
of the concepts they signify. The logical process might technically
be described as consisting of three steps: 1) translating words into
propositions that define their relationships (as in example 3 below,
translating "fish, minnows, things that live in water" into the
propositions that "some things live in water" and "all minnows are
fish"); 2) diagramming those propositions; and 3) selecting from
five diagrams the one that is appropriate to show the relationship of
the propositions. The final step in the logical process, drawing the
inference that all minnows live in water, is not required,
although the diagram illustrates that inference. It should be
emphasized that a student need not have studied this process formally
to solve the problems, nor will the student necessarily be aware of
the steps of reasoning taken to select the correct answer. The
purpose is to measure skills likely to have been learned in a variety of
contexts and in academic study of most kinds. The directions and
sample questions follow.



Note: The size of the circles does not indicate relative size of the
classes.

Example:

Birds, robins, trees

[Diagrams for answer choices (A) through (E)]

The correct answer, (A), shows that one of the classes (trees)
has no members in common with the other two. (No trees are
either birds or robins, and no birds or robins are trees.) (A) also
shows that one of the two remaining classes (robins) is
completely included in the other class (birds).
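The three pairwise relationships named in the directions map directly onto set operations. The sketch below is only an illustration under stated assumptions: the function name and the tiny example sets are invented here, and classes are represented as Python sets of member names.

```python
def class_relationship(x, y):
    """Name the relationship between two different classes given as sets.

    Assumes x != y, as in the directions ("any two different classes").
    """
    if x < y or y < x:
        return "contained"    # one class lies completely inside the other
    if x & y:
        return "overlapping"  # some members shared, neither contains the other
    return "disjoint"         # no members in common


# Hypothetical miniature classes for the birds / robins / trees example.
robins = {"robin"}
birds = {"robin", "sparrow"}
trees = {"oak", "elm"}

print(class_relationship(robins, birds))
print(class_relationship(birds, trees))
print(class_relationship(robins, trees))
```

For the example, robins fall inside birds while trees share nothing with either class, which is the configuration the keyed diagram is described as showing.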

The five possible choices for all problems in this part are given
below.

[Diagrams for answer choices (A) through (E)]



1. Nuts, pecans, forks (Correct response C)

2. Adult women, infants, black-haired people (Correct response D)

3. Fish, minnows, things that live in water (Correct response A)






Analytical Reasoning. Analytical reasoning consists of complex
sets of statements from which the student must draw inferences.
The statements may include abstractions such as symbols without
specific referents. The directions and sample questions appear
below. An asterisk denotes the correct response.

Directions: Each question or group of questions is based on a
passage or set of statements. In answering some of the questions
it may be useful to draw a rough diagram. Choose the best
answer for each question and blacken the corresponding space
on your answer sheet.

Questions 1-2

(1) It is assumed that a half tone is the smallest possible interval
    between notes.
(2) Note T is a half tone higher than note V.
(3) Note V is a whole tone higher than note W.
(4) Note W is a half tone lower than note X.
(5) Note X is a whole tone lower than note T.
(6) Note Y is a whole tone lower than note W.

1. Which of the following represents the relative order of the
notes from the lowest to the highest?

(A) X Y W V T    *(B) Y W X V T    (C) W V T Y X
(D) Y W V T X    (E) Y X W V T

2. Which of the following statements about an additional note, Z,
could NOT be true?

(A) Z is higher than T.    (B) Z is lower than V.
(C) Z is lower than W.     (D) Z is between W and Y.
*(E) Z is between W and X.
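Questions 1-2 can be checked mechanically by placing each note on a scale measured in half tones. The sketch below is one way to do so; the choice of W as the zero point and the helper name `has_room_between` are assumptions made for this illustration only.

```python
# Pitch offsets in half tones, measured from note W (W = 0).
pitch = {"W": 0}
pitch["V"] = pitch["W"] + 2  # (3) V is a whole tone higher than W
pitch["T"] = pitch["V"] + 1  # (2) T is a half tone higher than V
pitch["X"] = pitch["W"] + 1  # (4) W is a half tone lower than X
pitch["Y"] = pitch["W"] - 2  # (6) Y is a whole tone lower than W

# (5) X is a whole tone lower than T: a consistency check on the rest.
assert pitch["T"] - pitch["X"] == 2

# Question 1: order the notes from lowest to highest.
order = sorted(pitch, key=pitch.get)
print(" ".join(order))


def has_room_between(first, second):
    """True if another note could lie strictly between the two notes,
    given that a half tone is the smallest possible interval (statement 1)."""
    return abs(pitch[first] - pitch[second]) > 1


# Question 2: compare choices (D) and (E).
print(has_room_between("W", "Y"))
print(has_room_between("W", "X"))
```

W and X are only a half tone apart, so no additional note can fall between them, whereas W and Y are a whole tone apart and leave room for one.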

Questions 3-4

(1) You cannot enter unless you have a red ticket.

(2) If you present a blue form signed by the director, you will
    receive a red ticket.

(3) The director will sign and give you a blue form if and only if
    you surrender your yellow pass to him.

(4) If you have a green slip, you can exchange it for a yellow
    pass, but you can do so only if you also have a blue form
    signed by the director.

(5) In order to get a red ticket, a person who does not have a
    driver's license must have a blue form signed by the director.

(6) You can get a yellow pass on request, but you can do so only
    if you have never had a green slip.

3. The above procedures fail to specify

*(A) whether anything besides a red ticket is required for
     entrance

(B) whether you can exchange a green slip for a yellow pass

(C) the condition under which the director will sign the blue
    form

(D) how to get a red ticket if you have a yellow pass

(E) whether it is possible to obtain a red ticket if you do not
    have a driver's license

4. Which of the following people can, under the rules given,
eventually obtain a ticket?

I. A person who has no driver's license and who has only a
   green slip

II. A person who has no driver's license and who has only a
    yellow pass

III. A person who has both a driver's license and a blue form
     signed by the director

(A) I only    (B) II only    (C) I and II only
*(D) II and III only    (E) I, II, and III

Content Specifications for the Analytical Ability Measure

The content specifications for the analytical ability measure are
based on achieving an approximate balance between questions
with greater face validity for students with a humanities-social-
studies orientation and those with greater face validity for students
with a science orientation (though those categories are clearly not
exclusive or independent). The specifications now call for 40
analysis of explanations questions (appearing more closely related
to kinds of analysis used in the humanities and social studies) and
30 questions (15 logical diagrams and 15 analytical reasoning)
appearing more closely related to the kind of analysis required in the
sciences. Since the analytical measure was introduced for the first
time in October of 1977, a detailed breakdown of specifications is
not yet in final form. Diversity of subject matter and questions is the
general rule.



Statistical Characteristics

Item and test analyses, which are regularly performed for every new
test form introduced, provide information on the statistical
characteristics of the test and its components. The most important
statistical information indicates the difficulty, reliability, interrelationship
of test components, and speededness. Data on these characteristics
for five recent Aptitude Test forms administered prior to
October 1977 are shown in Table 5, and data for the first two
restructured test forms administered in October 1977 are shown in
Table 5A. The analyses providing these data were based on samples
representative of the administrations in which the respective tests
were introduced rather than the total GRE population in a given
year or years. Of the five prior forms shown in Table 5, three were
introduced in April and two in October. The October examinees are
consistently more able, on the average, than the April examinees.

The reliability estimate for the verbal sections of a typical prior
form of the Aptitude Test is .93 and for the quantitative section .91,
with corresponding standard errors of measurement of 33 and 40
converted scaled score values, respectively.* Taken separately, the
two verbal components (discrete verbal questions and reading
comprehension) have reliabilities in the middle to upper .80s.
Thus, the reliability of each of the verbal sections is higher than the
intercorrelation between the two, which is in the middle .70s,
suggesting that the two verbal components contribute somewhat
independent indicators of verbal ability.
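The relation between reliability and the standard error of measurement, and the comparison of section reliabilities with their intercorrelation, can be sketched with two standard classical-test-theory formulas. The manual does not state which formulas were used, so both are assumptions here, and the input values (a standard deviation of 133, reliabilities of .85, an intercorrelation of .75) are illustrative figures chosen to fall within the ranges quoted in the text.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def disattenuated_correlation(r_xy, r_xx, r_yy):
    """Correlation between two measures corrected for the unreliability
    of each (Spearman's correction for attenuation)."""
    return r_xy / math.sqrt(r_xx * r_yy)


# Illustrative quantitative section: SD of 133 scaled score points, reliability .91.
print(round(standard_error_of_measurement(133, 0.91)))

# Illustrative verbal components: reliabilities of .85 each, intercorrelation .75.
print(round(disattenuated_correlation(0.75, 0.85, 0.85), 2))
```

With these inputs the standard error comes out near 40 scaled score points, in line with the figure quoted for the quantitative section, and the corrected correlation stays below 1.00, which is the sense in which the two components carry partly independent information.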

The correlation between the verbal ability score and the quantitative
ability score is about .56 in a typical prior form. The correlation
between the discrete verbal and quantitative sections is [illegible] and the



*Reliability is estimated by the Kuder-Richardson formula (20), adapted for use with
formula scores.
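For readers unfamiliar with Kuder-Richardson formula 20, a minimal sketch of the unadapted formula for right/wrong item scores follows. The adaptation for formula scores mentioned in the footnote is not described in this section and is not attempted here; the data set is hypothetical.

```python
def kr20(responses):
    """KR-20 for a list of examinee response vectors of 0/1 item scores.

    KR-20 = (k / (k - 1)) * (1 - sum(p * q) / variance of total scores),
    where k is the number of items, p the proportion answering an item
    correctly, and q = 1 - p.
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(person) for person in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n  # population variance
    pq_sum = 0.0
    for item in range(k):
        p = sum(person[item] for person in responses) / n
        pq_sum += p * (1.0 - p)
    return (k / (k - 1)) * (1.0 - pq_sum / var_total)


# Hypothetical data: 4 examinees, 3 items (1 = right, 0 = wrong).
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(kr20(data), 2))
```

Operational analyses would of course use thousands of examinees and the full set of questions in a section.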






Table 5: Statistical Characteristics of Five Recent Prior Forms
of the GRE Aptitude Test

                                  April        October      April        April        October
Statistic                         1973         1973         1974         1975         1976

Number of examinees               [illegible]  33,427       27,287       25,978       29,229

Scaled score range
  V                               200-830      220-840      200-850      200-860      200-880
  Q                               200-850      200-850      200-820      200-820      200-840

Scaled score mean
  V                               485          519          [illegible]  476          512
  Q                               490          519          489          486          525

Scaled score standard deviation
  V                               115          120          127          126          131
  Q                               133          [illegible]  134          134          137

[Heading illegible]               1.090        1.115        [illegible]  [illegible]  1.960

[Heading illegible]
  V                               53%          [illegible]  51%          55%          59%
  Q                               57%          59%          [illegible]  57%          63%

Reliability
  V                               .93          .93          .93          .93          .94
  Q                               .91          .90          .91          .91          .91

Correlation of V and Q            .56          .53          .53          .56          .60

Percent completing three-fourths of section
  Verbal Section I*               94%          [illegible]  92%          [illegible]  96%
  Verbal Section II*              98%          97%          97%          93%          98%
  Q                               97%          97%          96%          96%          96%

Percent completing section
  Verbal Section I*               55%          55%          52%          53%          66%
  Verbal Section II*              73%          71%          76%          78%          [illegible]
  Q                               [illegible]  50%          44%          49%          36%

*The verbal questions of the prior Aptitude Test were in two separately
timed sections. Section I contained discrete questions and was given in
25 minutes. Section II contained reading comprehension questions and
was given in 50 minutes.



Table 5A: Statistical Characteristics of the First Two Restructured Forms
of the GRE Aptitude Test

(Based on Two Separate Evenly Spaced Samples of 1,950 and 1,945
Examinees from the October 1977 Administration)

[Table body not legible in the source copy]

*The analytical questions of the restructured Aptitude Test are in two
separately timed sections. Section III contains analysis-of-explanations
questions and is given in 25 minutes. Section IV contains logical diagrams
and analytical reasoning questions and is given in 25 minutes.



correlation between the reading comprehension and quantitative
sections is .59.

For the first two forms of the present restructured Aptitude Test
administered in October 1977, the reliability of the verbal ability
measure, which has discrete verbal and reading comprehension
questions combined in one section, remains at .93. The reliability of
the shortened quantitative ability section is [illegible], and that for the new
analytical ability measure .92. Standard errors of measurement for
the scores on the restructured test are 33 for verbal ability, 38 for
quantitative ability, and 36 for analytical ability.

The correlation between the verbal ability and quantitative ability
scores on the restructured test is approximately .54; between the
verbal ability and analytical ability scores .73; and between the
quantitative ability and analytical ability scores .70.



One set of standards sometimes taken to indicate that a test is a
power test and lacks any significant speed factor is that virtually all
examinees reach three-fourths of the questions and 80 percent
reach the last question. The percentage completing three-fourths
of the test is the more reliable indicator because the percentage
completing the test depends entirely on the number answering the
very last question. Often there is quite a large difference between
the percentage reaching the next-to-last question and the
percentage reaching the last question.
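The two completion percentages used in this standard can be computed from the number of the last question each examinee reached. The sketch below is illustrative: the function names are invented, the last-question criterion is taken as 80 percent, and the 99 percent cutoff standing in for "virtually all" is purely an assumption.

```python
def speededness_summary(last_reached, n_questions):
    """Percent of examinees reaching three-fourths of the questions
    and percent reaching the last question.

    last_reached holds, for each examinee, the number of the last
    question attempted (1-based).
    """
    n = len(last_reached)
    three_fourths = 0.75 * n_questions
    pct_three_fourths = 100.0 * sum(r >= three_fourths for r in last_reached) / n
    pct_last = 100.0 * sum(r >= n_questions for r in last_reached) / n
    return pct_three_fourths, pct_last


def looks_unspeeded(pct_three_fourths, pct_last, nearly_all=99.0):
    """Rule of thumb from the text; `nearly_all` is an assumed cutoff."""
    return pct_three_fourths >= nearly_all and pct_last >= 80.0


# Hypothetical 40-question section taken by ten examinees.
reached = [40] * 8 + [35, 28]
summary = speededness_summary(reached, 40)
print(summary)
print(looks_unspeeded(*summary))
```

Because the second percentage hinges on a single question, a small shift in how many examinees attempt the final item can move it substantially, which is why the text treats the three-fourths figure as the more reliable indicator.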

In terms of this set of standards, the test sections of five recent
forms of the prior Aptitude Test were slightly speeded for those
taking them. The percentages completing three-fourths of the test
sections ranged from 92 to 98 percent with a median value of 96
percent. The verbal sections of the first two restructured forms are
somewhat speeded, but the other sections are only slightly
speeded.

Another approach to investigating speededness is factor
analysis. Factor analyses were performed on two forms of the GRE
Aptitude Test given in October 1975 (Powers, Swinton, & Carlson,
1977). The results showed that speededness associated with the
discrete verbal questions accounted for 6.2 and 7.7 percent of the
common variance of the first and second forms, respectively,
whereas the factor reflecting speededness in the reading
comprehension passages explained only 2.5 and [illegible] percent. A
factor of quantitative speed, accounting for 2.5 percent of the common
variance, emerged as a separate factor in the second form only.

Statistical Specifications

The statistical specifications for each form of the test are fairly
constant and change only gradually and for compelling reasons.
The purpose of such stability in specifications is to assure parallel
forms of the test and thus comparability of scores regardless of
form. Statistical adjustments for remaining unavoidable differences
in test forms are thus smaller and less susceptible to error than if
forms were widely divergent.

Essential to the effectiveness of detailed, fixed specifications is
the pretesting process. Since all questions used in the Aptitude
Test are tried out experimentally on the regular GRE population,
without being counted toward students' scores, the statistical
characteristics of individual questions are known and can be used
in selecting questions that will meet the statistical specifications
and result in an appropriate test for the population.

The primary statistical specifications are: 1) difficulty of the test
(expressed as a mean delta for questions), 2) range of question
difficulties, and 3) mean question-test correlation (r biserial). The
test assembler knows the difficulty and question-test correlation of
each question in the usable pool. Thus, the test can be constructed
to provide a full range of scores, rather than measure at one end or
at the middle of the scale only, and appropriate total reliability. The
statistical specifications for the Aptitude Test appear in Table 6.
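This section does not define the delta index. The sketch below assumes the conventional delta scale, on which a question's difficulty is the normal deviate of the proportion answering incorrectly, rescaled to a mean of 13 and a standard deviation of 4, so that harder questions receive larger values. The function names and the sample proportions are hypothetical.

```python
from statistics import NormalDist


def delta(p_correct):
    """Delta index for a question answered correctly by proportion p_correct.

    Assumes delta = 13 + 4z, where z is the normal deviate corresponding
    to the proportion answering incorrectly. Requires 0 < p_correct < 1.
    """
    z = NormalDist().inv_cdf(1.0 - p_correct)
    return 13.0 + 4.0 * z


def mean_delta(p_values):
    """Average delta over a set of questions."""
    return sum(delta(p) for p in p_values) / len(p_values)


# A question answered correctly by half the group sits at the scale midpoint.
print(round(delta(0.5), 1))

# Hypothetical pool of five pretested questions.
print(round(mean_delta([0.35, 0.45, 0.55, 0.65, 0.75]), 1))
```

Under this assumption a test assembler could compare the mean delta of a draft form with the target range in the specifications before the form is finalized.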



Table 6: Statistical Specifications for the
GRE Aptitude Test

                           Mean           Standard Deviation    Mean
Measure                    Delta          of Delta              r biserial

Verbal                     12.3 - 12.7    2.5 - 2.8             .43 - .47
Quantitative               12.3 - 12.7    2.8 - 3.0             .50 - .55*
Analytical (tentative)     12.3 - 12.7                          .43 - .47

*Since there are fewer quantitative questions than verbal and analytical
questions, the mean r biserial must be higher to obtain appropriate
reliability.



Although these specifications do not provide separate
requirements for each type of question within a measure, wide variations in
the statistical characteristics of the components of the test are not
tolerated. For example, the reading comprehension questions cannot
have a mean delta of 8 while the mean delta of the discrete
verbal questions is 16. The mean delta range considered acceptable
for the various types of questions is approximately 11 to [illegible].
The specifications are not always met precisely for a given
administration in which the test is analyzed because the populations taking
the test at different times of the year vary. This variation affects
results of the analysis of a given final form and of pretest material.

Relationship of Statistical Analysis and Research
to Test Specifications

Research and statistical analysis play a major role both in setting
specifications for the Aptitude Test and meeting specifications
for a final form with pretested and statistically analyzed questions.
Research that is closely related to the Aptitude Test is not usually
formally reported or published because its primary audience is test
developers who can act on the results. In some cases, materials
intended for use in a future form are the subject of the research and
would be compromised by publication. However, such research is
an important part of the test-making process.

Standard Activities 

After every form of the test is introduced, it is analyzed statistically.
In addition, item (question) analyses are performed giving
information on each question in the test. Item analyses for the pretest are
particularly important because they enable the test specialist to
identify and eliminate questions that are not performing
consistently with similar questions in the final form or with the rest
of those in the pretest section. Pretesting and the item analysis
process provide the necessary data for meeting the specifications
for a final form of the test.

Statistical information is also necessary for revising
specifications where necessary. For example, in test analyses performed
prior to 1972, it was noted that Section I of the verbal measure
appeared to be speeded; that is, not enough people were reaching the
last question, too few people were reaching three-fourths of the
questions, and the variance of questions not reached was too high
relative to the variance of scores. For this reason the test
specifications were changed in 1972 from requiring [illegible] discrete verbal
questions in that section to 55. Since the reliability of the verbal measure
was well above .90, reliability was not threatened by this change. A
gradual decrease in the number of word problems in the quantitative
section of the test (that is, a reduction in the number of questions
requiring a great deal of reading to understand and then solve
the problem) is due to careful investigation of the correlation of the
verbal and quantitative measures as well as the philosophical
stance that the verbal and quantitative measures should be as
independent as possible.
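Why a modest reduction in the number of questions need not threaten a reliability well above .90 can be illustrated with the Spearman-Brown formula. The manual does not say that this formula was used in reaching the 1972 decision, and the starting reliability and length ratio below are hypothetical.

```python
def spearman_brown(reliability, length_ratio):
    """Predicted reliability when test length is multiplied by length_ratio,
    assuming the added or removed questions are parallel to the rest."""
    return (length_ratio * reliability) / (1.0 + (length_ratio - 1.0) * reliability)


# Hypothetical: a measure with reliability .93 shortened to 95 percent
# of its original length.
print(round(spearman_brown(0.93, 0.95), 3))

# Halving the same measure would cost considerably more.
print(round(spearman_brown(0.93, 0.5), 3))
```

The same relation underlies the Table 6 footnote: a section with fewer questions needs questions that correlate more highly with the total score to reach a comparable reliability.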

Special Activities 

Periodically, other kinds of research or statistical analyses are
carried out to evaluate or explore the possibility of changing test
specifications. Analyses carried out in other programs at ETS (such
as the Law School Admission Test Program, the Graduate
Management Admission Test Program, and the College Entrance
Examination Board's Admission Testing Program) may also contribute to
the GRE test development process. Examples of analyses for other
programs that have had some influence on the thinking of test
developers working on the GRE are: 1) criterion validity research
done by the Law School Admission Test Program on a type of
question considered by the GRE for the newly restructured Aptitude
Test; 2) factor analyses of the Scholastic Aptitude Test (which until
very recently had content similar to the GRE Aptitude Test, though
not as difficult); and 3) coaching studies carried out by the College
Board and other test sponsors to determine whether aptitude test
questions are, indeed, measuring skills developed over a long
period rather than skills that can be learned in a brief cramming
session.

Coaching studies in a variety of settings and for a number of
groups have shown that special cramming sessions or study of
test-taking strategies for aptitude tests cannot substantially improve
performance on quantitative and verbal questions like those that
have traditionally been part of the GRE Aptitude Test. However, in a
recent SAT study (Pike and Evans, 1972) in which coaching was
redefined to suggest a substantial component of instruction in
mathematics as well, performance on the quantitative questions,
quantitative comparisons in particular, was improved. A similar
study was initiated to test the findings in a GRE context, but the
results were not interpretable because of the scanty data resulting
from a high dropout among subjects in the experiment.
Nevertheless, the SAT findings suggest that significant instruction in
quantitative skills, as opposed to instruction primarily in test-taking
strategies, can be effective and may be reflected in test scores. It
cannot be concluded that quantitative comparisons are in the usual
sense coachable since coaching and mathematics instruction were
combined in the Pike and Evans study showing score improvement
on that type of question. The GRE Information Bulletin contains
sample quantitative comparisons, and a full complement of
quantitative comparisons is included in the GRE Sample Aptitude Test,
both of which are accessible to all students. It is assumed that any
possible coaching effect, if such an effect should exist separately
from instruction, will be standard for all students if all become
familiar with the question type before taking the test. Future research
is expected to explore further the question of coachability and
instruction in the widely applicable skills measured by the GRE
Aptitude Test.

The Sample Aptitude Test was first published in 1975-76 to give
all students equal access to information on the kinds of questions
in the test and equal opportunity to become familiar with them and
ways to solve them before taking the test. Another reason for
providing the Sample Aptitude Test was to make it possible for
students to obtain more information on the test without turning to
marketed materials that, though designed to prepare students for
the GRE, may not actually parallel the content of the GRE.

Research on subpopulations has been done in the GRE Program
and in other testing programs to determine the interaction of test
content and students' performance. For example, in one study the
performance of males, females, blacks, and whites has been
analyzed to determine whether different kinds of questions or
topical material have differential effects on the question difficulties
for those various groups. (See Population Validity in Chapter 6.)

Still other special subpopulation studies concern appropriateness
of the timing and directions of the test. Currently, a study is
underway to determine whether allowing additional time for verbal
and quantitative questions will have differential effects on
subgroups of the population identified by age, sex, or ethnic
characteristics. Research is also underway to examine closely the
guessing procedures students use, students' attitudes toward
guessing as engendered by the test instructions, and the possible
differential effects of various guessing instructions on different
subgroups of the population. If results of this research show that
formula scoring or guessing instructions may be working to the
disadvantage of some students, the method of scoring or the
wording of the directions will be reconsidered.
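Formula scoring is mentioned here without its formula. The sketch below assumes the conventional correction, in which a fraction of the wrong answers is subtracted from the number right and omitted questions count neither for nor against the examinee; whether the GRE applied exactly this rule is not stated in this section, and the counts used are hypothetical.

```python
def formula_score(n_right, n_wrong, n_choices):
    """Rights minus wrongs / (choices - 1); omitted questions are ignored.

    Under this rule the expected gain from purely random guessing is zero.
    """
    return n_right - n_wrong / (n_choices - 1)


# Hypothetical examinee on five-choice questions: 40 right, 8 wrong, rest omitted.
print(formula_score(40, 8, 5))

# The same counts on four-choice questions lose more per wrong answer.
print(formula_score(40, 8, 4))
```

Because the penalty depends on how examinees respond to uncertainty, differences in guessing behavior among groups can translate into score differences, which is the concern the research described above is meant to examine.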



Research Related to Restructuring the Aptitude Test 

Perhaps the relationship between statistical anatysis and research 
and the Aptitude Test development Process fs best iffustrate<f by the 
research effort precedmg the decision to restructure the Aptitude 
Test to include shortened versions of the verbal and quantitative 
rneasures and t a new anaJyticaf measure. Because of the 
significance of the proposed change, the Graduate Record Exami- 
nations Board, particularly the Research Committee. mOnilOred all 
stages of the research, determining what questions seemed to 
merit further investigation and makii^g final decisions concerning 
results. The extensive research effort was focused on possible ways 
ot broadening the definition of talent measured by the GRE Ap- 
titude Test. 

The verbal and quantitative ability measures, both of which were
respected for the usefulness they had demonstrated over the years,
were examined to see whether they could be made more useful and
whether they could be shortened to make room for possible new
measures, should they become available. Concurrently, research
was undertaken to determine whether a supplemental measure
could be designed that would allow students to demonstrate a
broader range of skills and permit educational institutions to better
judge the academic qualifications of their applicants.

Several methods of study were used: constituency surveys
(questionnaires addressed to students, faculty members, and
administrators), experimental pretesting followed by item and test
analyses, factor analyses, validity research using self-reported
undergraduate grades as the criterion, and some analyses for special
subgroups of the population.

Research Related to Changes in the Verbal and Quantitative
Measures. Research related to the verbal and quantitative ability
measures was intended to show whether these measures could be
shortened to make room for a new measure and whether the
diversified content of the reading comprehension measure could
be replaced by specialized material to be selected by students on
the basis of their undergraduate background. It was hoped that
specialized reading material would increase the validity of the
verbal measure and provide a useful subscore without affecting the
comparability of total verbal scores. The investigation focused on
the following questions:

1. Can the GRE verbal and quantitative ability measures be
shortened?

(a) What effect will a reduction in the number of reading
comprehension questions have?

(b) What effect will the introduction of relatively short reading
comprehension passages have?

(c) What effect will the introduction of quantitative comparison
questions have on the quantitative measure?

2. Can reading comprehension subscores be based on different
reading selections for students with different undergraduate
majors? Are total verbal scores, based on reading material
corresponding to undergraduate background, comparable to past
reported total verbal scores based on diversified reading
material?

A related subsequent question was: If it is not feasible to provide
subscores based on material selected by students according to
major field, can a reading comprehension subscore based on common
material for all students be provided? Because of the high
intercorrelation of such a subscore with the total verbal score, the
answer to this question was "no."

[Figure: diagram comparing the operational test sections (discrete
verbal questions; reading comprehension on diversified topics; the
75-minute diversified quantitative measure) with the 25-minute
experimental sections (reading comprehension based on
humanities-social science passages or natural science passages;
quantitative sections of data interpretation, quantitative
comparisons, regular mathematics problem solving, and quantitative
comparisons combined with problem solving).]

Several methods of investigation were used. A survey of
departmental representatives and administrators was made in a special
questionnaire in the GRE Board Newsletter. Short experimental
variations of the reading comprehension and quantitative measures
were developed and included as trial material in a regular
administration of the GRE Aptitude Test. It was then possible to
compare 1) the combination of operational discrete verbal questions
and the 50-minute operational section of reading comprehension
questions with an experimental combination of the operational discrete
verbal section and the 25-minute experimental reading comprehension
section, and 2) the 75-minute quantitative measure containing
three types of questions with each of four 25-minute experimental
sections. The comparisons are diagrammed above. Factor analyses
were performed to test the potential usefulness of the experimental
material.

Research results suggested that the verbal and quantitative
measures could be shortened without altering the original functions
of these measures or the comparability of scores on the
original and new versions. The verbal measure could contain 25
instead of 40 reading comprehension questions without falling
below .90 in reliability. Because of the lower proportion of reading
comprehension questions, which have a higher correlation with
quantitative scores than discrete verbal questions, the separateness
of the verbal and quantitative ability measures would, in fact,
be enhanced. However, optional reading comprehension sections
based only on specialized topical material would result in lack of
equivalence between total verbal scores on the original test and the
new test. The inclusion in the quantitative section of a number of
quantitative comparisons would not noticeably alter the factor
structure of that measure or its comparability to the original test.
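The manual does not say here how the reliability of a shortened measure was projected. One standard tool for that kind of projection is the Spearman-Brown formula, sketched below as an assumption rather than as the method actually used; the starting reliability and question counts in the example are hypothetical.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Projected reliability when a test is lengthened or shortened.

    length_factor is new length / old length (below 1.0 means shortening).
    Assumes the added or removed questions are comparable to the rest.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Hypothetical example: a 100-question measure with reliability .94
# is cut to 85 questions by dropping 15 of them.
projected = spearman_brown(0.94, 85 / 100)
print(round(projected, 3))
```

In this hypothetical case the projected reliability stays above .90, which is the kind of check the paragraph above describes.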



Research Related to the New Analytical Measure. At the GRE
Board's direction, seven types of questions intended to measure
various aspects of reasoning ability were developed. Various
sources of questions purported to measure reasoning or analytical
skills were examined, such as some of the measures in the French
Factor Kit, components of the Law School Admission Test and
Graduate Management Admission Test, the Watson-Glaser Test of
Critical Thinking, and the Cornell Test of Critical Thinking.
However, the emphasis was on creating new question types, each
intended to tap a different aspect of reasoning or analytical skills.
When pretests of each of the seven question types administered
with the regular GRE Aptitude Test were analyzed, a number of
questions were posed: 1) Will the new test questions yield material
that is appropriately difficult, reliable, and unspeeded? 2) Will they
measure skills that are relatively independent of verbal and
quantitative abilities? 3) Will they be valid in relation to the
criterion of self-reported undergraduate grades? 4) What combination
of the new test questions, if any, would be appropriate to create a
new measure that would add to the value of the Aptitude Test?

To provide answers to these questions, each type of question was
pretested in one of three regularly scheduled GRE national
administrations. Each pretest was taken by a substantial number of
GRE examinees. For all but one question type, at least three samples
were drawn: a representative (spaced) sample of all students, a
sample of biological and physical science undergraduate majors,
and a sample of humanities and social science undergraduate majors.
In addition, separate analyses for one pretest were based on
samples of black males, black females, white males, and white
females.

The efficiency, criterion validity, difficulty, reliability,
speededness, correlation with verbal and quantitative ability
measures, and appropriateness for students with different academic
backgrounds were investigated for each type of question. To assess
the face validity of each question type and the way in which
different groups perceived its utility, surveys were administered to
samples of students who had taken each pretest, and two student
committees offered opinions about samples of the experimental
questions. Presentations were made at a number of national and
regional meetings of professional associations, and the questions
were briefly discussed by some GRE Advanced Test committees of
examiners. As a result, a number of decisions could be made about
the appropriateness of each of the seven question types as a
possible part of a new measure.

Table 7: A Comparison of Various Experimental Question Types

[Column headings are mostly illegible in the source scan; "Criterion
Validity," "Score," and "Time" can be made out. Entries for each
question type are listed in their source order. [?] marks a blank or
illegible entry, and some leading decimal points appear to have been
lost in scanning.]

December 1975. Letter Sets: 92; 7.9; 7 min per question; 45%; [?]%; .22; [?]; .65
[?] 19[?] and December 1976. [?] Reasoning: [?]; 12 13; 1.2 min per question; 66%; 67%; .25; .90; .68
[?] 19[?] and [?]. [?] Reasoning: 69; 13 13; 1.2 min per question; 50%; 79%; .25; 78; 78
December 1975. Evaluation of Evidence: 76; 13.8; .6 min per question; 57%; 76%; .22; .73; .59
December 1975. Analysis of Explanations: 78; 13.3; .6 min per question; 57%; [?]%; .27; .73; [?]6
[?]. *Logical Diagrams: 9[?]; 11.[?]; .5 min per question; 52%; 83%; .16, .18; .77; .67
June 1976. Deductive Reasoning: .67; 11.8; 1.8 min per question; [?]; [?]; .13; .52; .79

Notes to Table 7 (footnote markers are illegible in the source):

Reliability: Estimated by the Kuder-Richardson formula (20) adapted
for use with formula scoring.

Difficulty: Difficulty is given in terms of the delta scale, with a
mean of 13 and a standard deviation of 4. For five-choice questions
such as these, middle difficulty is a delta of 12 (60% answered
correctly: 50% theoretically "knew" the answer and 10% guessed
correctly). (See also page [?] and Chapter 5.)

Reliabilities for logical reasoning and analytical reasoning have
been adjusted so that they are comparable to the other reliabilities,
based on 25-minute sections.

This reliability figure is inflated by speededness.

No student questionnaire data are available.

Corrected for attenuation.
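The note to Table 7 on the delta scale can be checked with a short sketch. It assumes the usual definition of delta as a normal deviate rescaled to a mean of 13 and a standard deviation of 4, with larger values for harder questions; the function names are hypothetical, and the manual's own treatment is in Chapter 5.

```python
from statistics import NormalDist


def proportion_correct_with_guessing(p_know: float, choices: int = 5) -> float:
    """Proportion answering correctly if those who do not know guess at random."""
    return p_know + (1.0 - p_know) / choices


def delta_from_proportion_correct(p: float, mean: float = 13.0, sd: float = 4.0) -> float:
    """Delta index (assumed definition): mean - sd * z, where z is the
    standard normal deviate below which a proportion p of the area lies."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return mean - sd * NormalDist().inv_cdf(p)


p = proportion_correct_with_guessing(0.50)   # 50% know, 10% guess correctly
print(round(p, 2))                           # 0.6
print(round(delta_from_proportion_correct(p), 1))   # 12.0
```

Under these assumptions, a five-choice question that half the group knows lands at a delta of about 12, in agreement with the table note.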

Statistical characteristics of each type of question are indicated
in Table 7. The asterisked question types are included in the new
analytical measure.

References

Coffman, W. E. Principles of developing tests for the culturally
different. Proceedings of the 1964 Invitational Conference on
Testing Problems. Princeton, N.J.: Educational Testing Service, 1965.

Educational Testing Service. A confidential testing service for use
by colleges and universities in programs of appraisal, selection,
and guidance (The Graduate Record Examination General
Bulletin Number 2). New York: Educational Testing Service,
October 1948.

Educational Testing Service. GRE National Administrations, 1977-78
Supervisor's Manual. Princeton, N.J.: Educational Testing
Service, 1977.

Graduate Record Examinations Board. Newsletter. Princeton, N.J.:
Educational Testing Service, September-October 1975.

Pike, L. W., & Evans, F. R. Effects of special instruction for three
kinds of mathematics aptitude items (CEEB Research Report 1).
New York: College Entrance Examination Board, 1972.
(Monograph)

Powers, D. E., Swinton, S. S., & Carlson, A. B. A factor analytic study
of the GRE Aptitude Test (GRE Board Professional Report
75-11P). Princeton, N.J.: Educational Testing Service, August 1977.






Chapter 4 

DEVELOPMENT OF THE ADVANCED TESTS 



Applicants for graduate study can demonstrate some of their
qualifications by taking an Advanced Test in the discipline they
studied as undergraduates. Beginning in October 1976, there were
Advanced Tests in each of 20 disciplines: biology, chemistry,
computer science, economics, education, engineering, French,
geography, geology, German, history, literature in English,
mathematics, music, philosophy, physics, political science,
psychology, sociology, and Spanish.

Decisions concerning the appropriateness of providing a test are
based on criteria developed by the GRE Board in 1967. These
criteria, given below, generally reflect the primary purpose of an
Advanced Test: to provide a high quality measure of undergraduate
achievement to aid in assessing the preparedness of students for
graduate study in a major discipline.

1. A significant number of institutions should offer a graduate
program in the field under consideration.

2. There should be a significant number of qualified graduate
faculty members whose teaching and research activities are in
that field primarily.

3. A significant number of matriculated graduate students should
be studying in that field primarily.

4. There should exist one or more appropriate learned societies
or professional associations in that field which publish one or
more scholarly journals in which original research articles are
published. One or more appropriate learned societies or
professional associations should express an interest in the
establishment of any new test offering and a willingness to
cooperate in its development.

5. The field should be sufficiently homogeneous so that a test
with satisfactory psychometric characteristics can be
developed.

6. There should be good reason to think that continued validation
of the test by appropriate methods will yield satisfactory
results.

7. The field should be amenable to techniques of testing that can
be reliably scored.

8. There should be a sufficient number of potential candidates in
any proposed new field so that adequate statistical data (e.g.,
scaling, norming) can be obtained on the first administration of
a test in that field.

9. A test in the field should be amenable to standardized test
administration procedures.

10. The field should be amenable to testing techniques which are
comparable in spirit and quality to those used in the other
Advanced Tests.

11. Introduction or continuation of a test in any one field should
not impose an undue financial strain on the GRE National
Program as a whole.




12. There should be a demonstrable need for a test in the field
which is not met adequately by other available instruments or
testing programs.

The Advanced Tests are constructed on the assumption, borne
out by evidence from validity studies, that graduate school
performance is related to achievement at the undergraduate level.
Thus, the tests focus on measurement of learning in undergraduate
curricula. Because the tests are intended primarily for graduate
applicants, on the average a more able population than all
undergraduate majors in a field, the tests in some cases may be
relatively difficult for the average senior who does not plan to
continue study. However, the tests are designed to cover the material
that would be encountered by the average senior majoring in a field.
Students who move from one undergraduate field to another graduate
field will tend to find the test in their undergraduate major field to
be more appropriate than the test offered in the field they plan to
enter.

One of the main advantages of the tests is the standard measure
of competence they provide. The scores reflect the relative standing
of all students on the same measure. A second advantage is that
subscores, reported for 9 of the 20 tests, show strengths and
weaknesses in particular subfields of the disciplines.

The Advanced Tests also have limitations. Students who specialize
very early and who do not have a broad background in the
field may find the coverage of a test inappropriate. The tests must
focus on topics to which the majority of students in a field have
been exposed. In some fields that incorporate a wide variety of
relatively independent subfields, such as education and engineering,
it is a particular challenge to find the "core" of knowledge that is
common to all.

Uses 

GRE Advanced Test scores generally are used to assist in making
decisions on admission to graduate programs and in awarding
fellowships. The total scores serve this purpose, and some
subscores are sufficiently reliable (near .90) to be used in making
admission decisions as well. The subscores are especially useful in
counseling admitted students and helping them decide what
courses they should take. Other uses of the Advanced Tests are as
indicators of the effectiveness of an undergraduate or master's
program and as comprehensive examinations at the undergraduate
level. (The "Guidelines for the Use of the GRE" in the Guide to the
Use of the Graduate Record Examinations includes a list of
appropriate uses of the Advanced Tests.) To properly use the tests, it
is important that the test content be reviewed, pertinent information
in addition to test scores be considered, and the relationships
between measures of the qualifications of students for graduate
study and measures of their later success in graduate study be
determined and recorded on a continuing basis (see the discussion of
validity in Chapter 6).

Some graduate departments require Advanced Test scores,
others recommend them, and still others recommend them under
certain circumstances. The number and percentage of graduate
school departments requiring or recommending that students
provide GRE scores in the 20 fields of graduate study for which
Advanced Tests are offered are shown in Table 8. These data are taken
from the Graduate Programs and Admissions Manual, 1976-77 edition.

Table 8: Policies of Graduate School Departments Listed in the
Graduate Programs and Admissions Manual on Use of the GRE Advanced
Tests for the 20 Fields of Study Whose Names Match or Closely Match
Those of the Tests

[Column headings are largely illegible in the source scan. The four
No./% pairs are presumed to follow the order of the text (required,
recommended, recommended in certain cases) followed by a fourth
category whose heading cannot be read; the last column is the total
number of departments. [?] marks an illegible entry.]

Advanced Test          Field of Study       Required    Recommended  Certain Cases  [illegible]   Total
                                            No.    %    No.    %     No.    %       No.    %      No.
Biology                Biology              [?]   [?]   [?]   12      31   10        89   28      319
Chemistry              Chemistry            [?]   24    [?]   15      84   27       103   33      311
Computer Science       Computer Science     New: no data
Economics              Economics            [?]   [?]   [?]    7      40   19        77   37      207
Education              Education, General    15   12     11    8      18   14        86   66      130
Engineering            Engineering*         [?]   12    [?]   22     163   26       255   40      631
French                 French                49   37   1[?]   14      19   14        46   35    13[?]
Geography              Geography            [?]   16     14   11      26   20        68   53      128
Geology                Geosciences           62  4[?]    28  [?]      16   10        51   30      167
German                 German                24   22     11   10      29   26        47   42      111
History                History              101   32     42   13      49   15       127   40      319
Literature in English  English              133   39     40   12      40   12       132   38      345
Mathematics            Mathematics           69   22     43   13      78   24       129   40      319
Music                  Music                 54   24     16    7      24   11       129   58      223
Philosophy             Philosophy            28   19     13    9      47   31        62   41      150
Physics                Physics               81   32     25   10      79   32        65   26      250
Political Science      Political Science     67   30     26   12      48   21        84   37      225
Psychology             Psychology           143   48     26    9      39   13        88   30      296
Sociology              Sociology             63   30     21   10      47   22        82   38      213
Spanish                Spanish               13   19     11   16      11   16        35   50       70

*Includes [illegible], chemical, civil, electrical, industrial, and
mechanical engineering.

[Note on the Graduate Programs and Admissions Manual, partly
illegible in the source: ... States and about the graduate programs
they offer in [?] major fields. The institutions represented in the
Manual are attended by more than 86 percent of all graduate students.
[Institutions] that offer master's or higher degrees in the fields of
education and the liberal arts and sciences were invited to supply
information for the Manual.]

Format

All the questions in each Advanced Test are of the multiple-choice
type. Eighteen tests have five-choice questions exclusively; the
Advanced German Test has four-choice questions exclusively; and the
Advanced Spanish Test has both four- and five-choice questions.

Various kinds of multiple-choice questions appear in the tests.
Most are independent items in which a question or incomplete
statement is followed by five (or four, as mentioned earlier)
suggested answers or completions. The examinee is to choose the one
option that best answers the question or completes the statement.
Many tests contain sets of questions in which all questions in the
set relate to one topic; ordinarily, a body of information on the
topic is presented at the beginning of the set. Sets can lead to
probing more deeply into a topic than is generally possible with
independent questions. Extensive information cannot be presented
with each individual question if a test is to be unspeeded. As many
as 10 questions are contained in some sets, although 3 to 5
questions are more usual.

In some sets of questions, five preset answer choices in the form
of words, sentences, graphs, equations, diagrams, charts, or
symbols are presented and serve as the options for several
questions that follow. These questions can usually be answered more
rapidly than independent questions because the same five options
are used repeatedly. The five options must be chosen in such a way
that they possess a reasonable degree of relatedness, and all five
have some plausibility as answers to each question in the set. Yet
they cannot be so closely related as to be nearly synonymous, thus
leading to two or more correct answers to a given question.

Some questions require the examinee to select the incorrect,
least likely, or exceptional response. The nature of such a task,
which is the reverse of the usual pattern, is communicated with
emphasis by capitalizing words such as INCORRECT, LEAST, or EXCEPT
in the stem of the questions. There is some evidence that such
questions are more difficult than those in the usual pattern. It does
seem important to be able to reason toward the identification of
exceptions as well as correct answers, and some questions lend
themselves more comfortably to this format than to the standard
pattern.






A number of questions in the tests are arranged to allow for
multiple response within the framework of recording only one
mark on the answer sheet for each question. A question of this type
may take the following form:

Factors responsible for the observed increase include which of
the following?

I. Higher pressure
II. Higher temperature
III. Lower humidity
IV. Lower wind speed

(A) I only (B) I and II only (C) III and IV only
(D) I, II, and III only (E) I, II, III, and IV
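The way such a question folds several true-or-false judgments into a single answer-sheet mark can be sketched as a lookup. The option combinations below follow the sample question as read from the source (option (B) is partly illegible there and is assumed to be "I and II only"); the function name is hypothetical.

```python
# Each lettered option stands for one combination of the numbered statements.
OPTIONS = {
    "A": frozenset({"I"}),
    "B": frozenset({"I", "II"}),            # assumed reading of the source
    "C": frozenset({"III", "IV"}),
    "D": frozenset({"I", "II", "III"}),
    "E": frozenset({"I", "II", "III", "IV"}),
}


def keyed_option(true_statements):
    """Return the single letter whose combination matches exactly the
    statements judged true, or None if no option matches."""
    target = frozenset(true_statements)
    for letter, combination in OPTIONS.items():
        if combination == target:
            return letter
    return None


print(keyed_option({"I", "II", "III"}))   # D
print(keyed_option({"II"}))               # None
```

Because only five of the sixteen possible combinations are offered, the examinee records one mark even though four separate judgments are involved.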

Each Advanced Test is given in 170 minutes. All the tests are
designed to be power rather than speed tests; consequently, the
guideline has been adopted that virtually all examinees should
complete three-fourths of the questions and about 60 percent
should finish the test. Other indicators of possible speededness
that are found and studied for each test edition include the means
and variances of the number of questions omitted and the number
of questions not reached and plots of the scores as a function of the
number of right plus wrong responses. The number of questions in
the tests varies from 66 (2.58 minutes per question) in the Advanced
Mathematics Test to 230 (0.74 minute per question) in the Advanced
Literature in English Test. In the more quantitative fields, where the
test questions sometimes require preliminary figuring before an
answer can be selected, more time is needed per question. A
complete listing of the number of questions and the average testing
time per question for each of the 20 tests is given in Table 9.
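The per-question times quoted above follow directly from dividing the 170-minute testing period by the number of questions. A minimal sketch, with a hypothetical function name, reproduces the two figures given in the text:

```python
TOTAL_MINUTES = 170  # each Advanced Test is given in 170 minutes


def minutes_per_question(number_of_questions: int) -> float:
    """Average testing time per question, in minutes."""
    if number_of_questions <= 0:
        raise ValueError("number_of_questions must be positive")
    return TOTAL_MINUTES / number_of_questions


print(round(minutes_per_question(66), 2))    # 2.58 (Advanced Mathematics Test)
print(round(minutes_per_question(230), 2))   # 0.74 (Advanced Literature in English Test)
```

The same division can be applied to any row of Table 9 as a check on the listed value.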



Table 9: Number of Questions in GRE Advanced
Tests and Average Testing Time per Question
(Each test is given in 170 minutes.)

                         Number of    Average Testing Time
Test                     Questions    per Question in Minutes
Biology                     210           0.81
Chemistry                   150           1.13
Computer Science             80           2.12
Economics                   160           1.06
Education                   200           0.85
Engineering                 150           1.13
French                      190           0.90
Geography                   200           0.85
Geology                     200           0.85
German                      210           0.81
History                     190           0.90
Literature in English       230           0.74
Mathematics                  66           2.58
Music                       200           0.85
Philosophy                  160           1.06
Physics                      90           1.89
Political Science           170           1.00
Psychology                  200           0.85
Sociology                   200           0.86
Spanish                     210           0.81

Committees of Examiners

For each GRE Advanced Test there is a committee of examiners
composed usually of five scholars in the discipline of the test. Some
committees have more than five members on occasion, and some
have fewer than five on much rarer occasions, but five is the typical
number. One of these scholars serves as the chairperson of the
committee.

The committee members or examiners are generally appointed
for two-year terms. Members may be reappointed for any number of
terms. In general, the membership of each committee changes
gradually every two years. It is typical for one or two examiners to
leave a committee and be replaced by one or two scholars new to
the committee every other year. Hence, a typical length of service
for a given committee member is four to eight years. The advantage
of this gradual change in the membership of the committee is that it
provides continuity and experience on the one hand and fresh
insights and approaches on the other.

The scholars who serve on a committee of examiners are almost
always college and university professors in the discipline of the
test. The few exceptions to this pattern invariably arise because an
examiner who was a professor at the time of the initial appointment
leaves the professional ranks for some other assignment during the
term of the appointment. The examiners tend to be members of
graduate school faculties at universities with large graduate
schools and high quality programs. This tendency arises from the
sensible assumption that academicians of this kind are more
closely concerned than anyone else with the selection of students
for graduate study. However, the Advanced Tests are predictors of
success in graduate study because they are measures of achievement
in undergraduate study. Therefore, it is reasonable to have
some professors who teach at the undergraduate level represented
on the committees.

Disciplines differ in scope and homogeneity. Some are so broad
they include several divisions that have themselves become
virtually separate disciplines. Examples include zoology in biology,
electrical engineering in engineering, American history in history,
and Spanish American literature in Spanish. An important
consideration in constituting a committee is adequate representation
of the important divisions that exist in the discipline. Thus, the
chemistry committee includes specialists in analytical, inorganic,
organic, and physical chemistry, and the physics committee
includes high-energy, nuclear, and solid-state physicists.

Another consideration in the appointment of committee
members is geographical representation. In an increasingly mobile
society in which a person may be born and reared in one region, do
undergraduate work in a second, do graduate work in a third, and
go on to become a professor in a fourth, geographical representation
is not an overriding consideration. However, this characteristic is a
very easy one to ascertain, and an attempt is made to represent
different regions of the country on each committee.

There is considerable concern, too, about fairly representing
the interests of blacks, other minorities, and women on Advanced
Test committees. Efforts to include at least one racial minority
member and at least one woman on every committee have met with
success.

The examiners are appointed by Educational Testing Service with
the cooperation and assistance of a leading learned society (or
societies) in the discipline of the test. The typical practice is for ETS
to submit the names of several scholars under consideration for a
committee to the executive secretary or president of the relevant
society. The society officer is asked to comment on the scholars
under consideration and to suggest additional scholars. The
scholars considered for committee membership come to attention
through a variety of channels. A scholar's writings in professional
journals or speeches at professional meetings may indicate special
qualifications for, or an interest in, serving as an examiner. Sometimes
scholars are suggested by current examiners who know what is
involved in committee work and hence can recognize which of their
colleagues may be especially suited for service on a committee.
Sometimes a number of professors are invited to write questions
for a test; those who are especially successful may later be
appointed to a committee of examiners.

Each committee of examiners reviews and approves the
specifications for the test for which it is responsible; writes,
reviews, selects, revises, and approves questions for the test; and
reviews and approves new editions of the test. Much of this work is
carried out by mail, but typically each committee meets once for
each new edition of a test that is developed. Thus, 12 committees
meet annually, and 8 meet biennially.

Several characteristics of the Advanced Tests must remain fixed
or can be changed only gradually. These stable characteristics
reflect the fact that the GRE Advanced Tests are part of an ongoing
testing program. For reasons of fairness, several editions of each
Advanced Test must be available for administration each year. To
be useful, these different editions must yield scores on the same
scale. For the Advanced Tests, this means that each new edition
must contain some questions from prior editions. It also means that
the test specifications cannot change abruptly but must evolve over
a period of time. Since all 20 Advanced Tests are given in the same
170-minute period in the same rooms, all tests must conform to the
same administration mode. Finally, for reasons of effective
measurement (that is, to provide measures of high validity and
reliability with economy of time, money, and effort), the test
questions must be of the multiple-choice type. Within this framework,
the committees of examiners are free to exercise their judgment and
creative skills in assessing the competencies of the examinees.

Content Specifications

When a new test is being introduced, the first task is to determine
the test's future content and to set specifications. A number of
problems must be faced and solutions devised. For example,
committee members must grapple with such questions as: What are the
major subareas of study within the field? Which are most
important? With which subareas are enough students familiar to
make the topics in those areas a reasonable focus of measurement?
How can balance among subareas and among skill dimensions
(knowledge and application, for example) be attained? If there
are professional and academic tracks in a field of graduate study,
which should be dealt with? Or can they be reconciled and both
included? Is there a core of material basic to the study of the
discipline or are specialized subfields relatively independent? Often
no answer that serves the purposes of testing is fully satisfactory,
but a consensus must be reached to provide a framework for future
test development.

The first step in developing a new test edition after a test has
been introduced is a review of test specifications. For many of the
tests, two aspects of the specifications (subject matter to be
covered and abilities to be measured) blend into a single dimension.
By the time the examiners have satisfied themselves that a
question requires the demonstration of a capability they believe to
be a significant one for a graduate student in their discipline to
possess, it is probably superfluous to inquire further into what
name might be appropriately attached to the ability needed to
answer the question. Hence, for many tests the specifications are
set purely in terms of the subject areas of the discipline, with
indications as to the number of questions to be included in each
category of that content. For a few tests, the content and abilities
are treated as two dimensions. Some attention is given to
ascertaining, for example, if a question requires recall of
information, application of information to the solution of a new
problem, or analysis of a given body of information.

At each meeting of the committee, the test specifications are
likely to be reviewed. The specifications agreed upon will guide test
development until the next revision in the specifications. The
practical implication of this procedure is that the specifications
agreed upon at the first meeting are likely to guide development of
the test prepared at a second meeting. Then the specifications
agreed upon at the second meeting will guide development of the
test prepared at a third meeting, and so on. The review of
specifications focuses on such questions as: Has the field or the
undergraduate curriculum changed in significant ways that will affect
student knowledge? The newest trends in a field may not, of course,
have yet had any effect on most curricula. Thus, the committee must
consider the experience of the majority of students rather than the
activities of a vanguard in the discipline.

One way of determining the appropriate content of the test is for
committee members who have direct experience with students and
with the teaching profession in the field to pool their knowledge of
students' common experiences. Another method, particularly in
cases where differences of opinion may exist, is to obtain
information directly from the students taking the test. Periodically,
students taking the Advanced Tests are asked to answer questions
about their undergraduate background as well as their educational
level (senior, graduate, etc.) and goal (master's, Ph.D., etc.). Most
questions elicit information on specific courses taken or areas of
concentration. (The answers to such questions for most of the
Advanced Tests at 1970-71 test administrations are presented in
Appendix II.) On occasion, more extensive questionnaires are
distributed to examinees at test centers following the test; the
examinees answer those questionnaires at home and mail their
responses to ETS.

Whenever department heads or faculty members review a
confidential inspection copy of a test, they are asked to complete a
test evaluation form, expressing their judgments of the
appropriateness of the test and of specific test questions for
particular purposes. In the late 1960s and early 1970s, panels of
faculty members at a number of institutions were systematically
identified and asked to complete more detailed test evaluation forms.
Several professors devoted many hours to these test reviews, and
their collective judgments on test and question appropriateness
proved valuable in determining test content.

Consultants may join the committees of examiners at their meetings
occasionally. Often the consultant is an officer in the relevant
professional association. Again, the discussion with the consultant
is likely to focus on the appropriateness of the test content. From
time to time, a panel of educators in a field may be convened to
evaluate the test and make recommendations, or the faculty in an
undergraduate field may be surveyed for reactions to test content.




Table 10: Statistical Characteristics of the Advanced Test Total Scores (1)

Column key:
  N        Number of examinees
  Mean     Scaled score mean
  SD       Scaled score standard deviation
  High     Highest scaled score earned
  Max      Highest possible scaled score (2)
  Low      Lowest scaled score
  Sample   Number in sample on which difficulty data are based
  % Right  Average percentage answering questions correctly
  Rel.     Reliability (3)
  SEM      Standard error of measurement (4)
  % 3/4    Percentage completing three-fourths of the test
  % All    Percentage completing the test (5)

The headings of the High, Max, and Low columns are only partly legible
in the source and are inferred from the entries. A question mark marks
a character that is illegible in the source; a dash marks an entry
missing from the source.

Test                       N  Mean   SD  High         Max          Low  Sample  % Right  Rel.  SEM  % 3/4  % All
Biology                4,001   640  110  980          990 (1,060)  260   1,300       55   .93   28    100      -
Chemistry                930   652  109  990 (1,010)  990 (1,140)  440     930       39   .93   29     99     30
Computer Science         485   637  108  ?70          920          390     485       55   .93   28    100     54
Economics                600   615  114  960          990 (1,020)  400     600       43   .95   25     9?     70
Education              2,164   45?   91  700          810          220   2,160       49   .94   22     99      -
Engineering            1,310   614  109  910          990 (1,010)  320   1,310       50     ?   27     98     5?
French                   190   519   ?8  770          810          290     190       53   .96   19      -      -
Geography                259   469   91  690          850          210     255       50   .93   24    100     84
Geology                  765   590   94  ?50          910          300     765       56   .94   22     99     75
German                   320   527  100  760          810          290     320       5?   .96   19    100     73
History                  865   520   81  760          870          330     885       45   .94   20     99     89
Literature in English  1,378   548  101  800          810          250   1,375       60   .97   18    100     82
Mathematics              993   707  143  990 (1,060)  990 (1,060)  420     990       52   .93   38    100     68
Music                      ?   508   96  760          820          270     580       52   .96   19     99     64
Philosophy               225   660  11?  960          990 (1,070)  380     225       48   .94   29     97     82
Physics                  965   657  136  990 (1,090)  990 (1,210)  370     965       42   .89   45     99     32
Political Science        625   491   ?4  680          850          250     625       50   .92   24     99     91
Psychology             3,348   550   90  ?10          940          270   1,675       51   .93   25    100     58
Sociology                429   4?4  119  780          990 (1,000)  210     425       44   .94   29     99     81
Spanish                  227   550  106  820          910          290     225       53   .95   23    100     92

(1) Values for editions introduced in 1976 (1974 for Geography and
German).

(2) The numbers in parentheses represent scaled scores that would have
been earned if the scale extended beyond 990. Scaled scores higher
than 990 are reported as 990.

(3) Estimated by the Kuder-Richardson formula (20) adapted for use with
formula scoring.

(4) This is the standard deviation of the scores an examinee would earn
if taking the test repeatedly, assuming that factors such as fatigue
were eliminable.

(5) As is noted on page 29, the percentage finishing the entire test may
be misleading since the number reaching the last question may differ
significantly from the number reaching the next-to-last question.



Statistical Specifications and Characteristics 

The statistical specifications call for the Advanced Tests to be of
middle difficulty, maximum reliability, and minimum speededness
within the time constraints. Data on the difficulty, reliability, and
speededness of the editions of each of the 20 tests introduced in
1976 (1974 for Geography and German) are given in Table 10. The
data are obtained for each test edition introduced.

For maximum effectiveness in guiding admission decisions, the
difficulties of questions should be such that about half the students
who are at the dividing line between admission and rejection
answer the questions correctly. If there were only one graduate
school and the ability level of students who just qualified for
admission to that graduate school could be established, then the test
questions could be pitched at the ideal level of difficulty for making
admission decisions at that school. Since there are hundreds of
graduate schools, a reasonable alternative is to construct tests
containing questions with a range of difficulties, but with an
average question of such difficulty that half the examinees who
respond to the question get it right.

In actuality, the Advanced Tests tend to be more difficult than the
specifications recommended. For only 13 of the test editions in
Table 10 is the average percentage of examinees answering questions
correctly 50 percent or greater. One could infer that, for a
five-choice question the answer to which is known by half the
examinees and is guessed by the other half, the most likely
percentage choosing the correct answer would be 60 percent. This
is used as a reasonable standard of middle difficulty. Only one test
is easy enough that the average percentage of examinees correctly
answering the questions is 60 percent. The difficult questions are
especially effective in distinguishing among higher-scoring
examinees, whereas easy questions are especially effective in
distinguishing among lower-scoring examinees.
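
The 60 percent figure follows from simple arithmetic: if half the
examinees know the answer and the other half guess among five choices,
the expected proportion correct is 0.5 + 0.5 x 1/5. The following is a
minimal sketch of that calculation. The function name is illustrative,
and blind, equally likely guessing among the choices is an assumption
made here for the example.

```python
def expected_proportion_correct(p_know: float, n_choices: int) -> float:
    """Expected proportion of examinees answering correctly when a
    fraction p_know knows the answer and everyone else guesses at
    random among n_choices equally likely options (an assumption)."""
    if not 0.0 <= p_know <= 1.0:
        raise ValueError("p_know must lie between 0 and 1")
    if n_choices < 2:
        raise ValueError("a question needs at least two choices")
    return p_know + (1.0 - p_know) / n_choices


# Five-choice question known by half the examinees.
print(round(expected_proportion_correct(0.5, 5), 2))  # 0.6
```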

If test scores are to have value, they must possess a high degree
of reliability. Reliability coefficients can range from 0.00 to 1.00,
and meaning can be attached to them in several different ways. The
method of determining reliability is discussed in Chapter 5. For the
purposes of discussion here, reliability will be considered a
statistical indicator of the tendency of a test to measure
consistently from one time to another.

The reliabilities of the Advanced Tests almost always exceed .90.
Such high reliabilities are desirable because the decisions being
made partly on the basis of the tests are significant ones with
considerable impact on people's lives. All the reliabilities in Table
10 equal or exceed .89, with six equal to or greater than .95. The
standard errors of measurement range from 18 to 45 scaled score
points, with a median of 25. Only two tests have standard errors of
measurement greater than 29; they are the Advanced Mathematics
and Physics Tests, which have relatively few questions. The
Advanced Mathematics Test has 66 questions and the Advanced
Physics Test 90. The nature of these fields, however, requires quite
time-consuming questions. There is a reluctance to increase the
number of questions because this might shift the measurement
emphasis away from problem-solving ability toward the recall of facts.
In January 1978 the committee of examiners for the Advanced
Physics Test decided to raise the number of questions in the test to
100 to increase the reliability and decrease the standard error of
measurement.
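
The relation among reliability, score spread, and the standard error
of measurement can be illustrated with two standard formulas from
classical test theory. Neither formula is quoted in this chapter, so
the sketch below is an assumption about how such figures are usually
related, not a statement of the procedure used for the tables: the
standard error is taken as SD x sqrt(1 - reliability), and the
Spearman-Brown formula is used for the reliability of a lengthened
test. The inputs (standard deviation 136, reliability .89, 90
questions) are the Advanced Physics Test values from Table 10 and the
text; the function names are illustrative.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def spearman_brown(reliability: float, length_factor: float) -> float:
    """Projected reliability when a test is lengthened by length_factor
    with questions comparable to the existing ones."""
    return (length_factor * reliability) / (
        1.0 + (length_factor - 1.0) * reliability
    )


# Advanced Physics Test: SD 136, reliability .89, 90 questions.
sem_physics = standard_error_of_measurement(136, 0.89)
rel_at_100 = spearman_brown(0.89, 100 / 90)
print(round(sem_physics))      # 45
print(round(rel_at_100, 2))    # 0.9
```

Under these assumptions the Physics figures give a standard error of
about 45 scaled score points, and lengthening the test from 90 to 100
comparable questions would raise the reliability only slightly, from
.89 to about .90.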

The Advanced Tests are intended to be of such a length that most
examinees will have time to consider most, if not all, of the
questions. Two speededness indicators are presented in Table 10: the
percentage of students completing three-fourths of the test and the
percentage answering the last question. Of the two speededness
indicators given in Table 10, the percentage completing three-fourths
of the test is the more reliable indicator. The percentage completing
the test depends entirely on the number answering the last
question. Often there is quite a large difference between the
percentage reaching the next-to-last question and the percentage
reaching the last question. One set of standards sometimes taken
to indicate that a test is a power test and lacks any significant speed
factor is that virtually all examinees reach three-fourths of the
questions and 80 percent reach the last question. If "virtually all" is
defined as 99 percent to 100 percent, then only three tests did not
meet this standard. Although the tests are clearly designed to be
power rather than speed tests, there is a strong desire to keep the
number of questions very close to the level that tends to produce
indications of speededness. The reason is that reliability depends on
the number of questions included in the test. Observations show
that examinees are quite variable in the rapidity with which they
work. If the number of questions were reduced to allow at least 80
percent to complete every test, much useful information would be
sacrificed, and the test reliabilities would suffer accordingly.
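
As an illustration of the two speededness indicators, the sketch below
computes the percentage of examinees reaching three-fourths of the
questions and the percentage reaching the last question. The input
format (for each examinee, the number of the last question reached)
and the function name are assumptions made for the example; the manual
does not describe how the percentages were tabulated.

```python
import math


def speededness_indicators(last_reached, n_questions):
    """Return (percent reaching three-fourths of the questions,
    percent reaching the last question).

    last_reached holds, for each examinee, the 1-based number of the
    last question reached (a hypothetical input format)."""
    if not last_reached:
        raise ValueError("at least one examinee is required")
    three_quarter_mark = math.ceil(0.75 * n_questions)
    n = len(last_reached)
    pct_three_fourths = 100.0 * sum(
        r >= three_quarter_mark for r in last_reached) / n
    pct_last = 100.0 * sum(r >= n_questions for r in last_reached) / n
    return pct_three_fourths, pct_last


# Ten hypothetical examinees on a 100-question test.
reached = [100, 100, 98, 99, 100, 76, 100, 88, 100, 74]
print(speededness_indicators(reached, 100))  # (90.0, 50.0)
```

The example also shows why the second indicator is less stable: three
of the ten examinees stopped one or two questions short of the end, so
the percentage reaching the last question falls well below the
percentage reaching the next-to-last question.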

In the assembly of most Advanced Test editions, computed
statistical information is available only for the equating questions
and a few other questions. If the new edition is being equated
through one prior edition, statistical information will be available
for about 20 percent of the questions. If it is being equated through
two prior editions, as is often done, information will be available for
about 40 percent of the questions. New questions are not pretested;
thus their statistical characteristics must be estimated.

Pretesting of questions for the Advanced Tests was introduced at
one time but was later discontinued. Pretesting does permit
construction of tests whose actual characteristics will more closely
meet specifications than otherwise. However, it has proved possible
to assemble Advanced Tests with fully adequate statistical
characteristics without pretesting new questions. (See the
discussion of pretesting in Chapter 2.)

Two or three subscores are reported for nine Advanced Tests.
The subscores are intended to provide information useful to
students in assessing their strengths and weaknesses and useful to
institutions in guiding and placing students. Subscores are
reported if the committee of examiners for a test identifies
subscores judged to serve these purposes and if the reliabilities of
the subscores exceed .80. The number of subscores represents a
compromise between the desire to report many subscores to more fully
describe the student's achievement and the desire to report
subscores of sufficient reliability to be used with confidence. The
reliability of a subscore depends in part on the number of questions
contributing to it. As the number of subscores increases, the
number of questions contributing to a subscore and hence the
reliability of that subscore tend to decrease. Since subscores tend to
have lower reliabilities than total scores, they are more
appropriately used for placement than admission decisions. Data on
subscore performance, the average difficulty of questions contributing
to each subscore, and the reliability of subscores are given in Table
11. All but two of the subscores have reliabilities at or above .85.

Table 11: Statistical Characteristics of the Advanced Test Subscores*

Column key:
  N        Number of examinees
  Mean     Subscore mean
  SD       Subscore standard deviation
  Sample   Number in sample on which data are based
  % Right  Difficulty: percentage answering correctly
  Rel.     Reliability of subscore
  SEM      Standard error of measurement of subscore

A question mark marks a character that is illegible in the source; a
dash marks an entry missing from the source.

Test and Subscore                                     N  Mean   SD  Sample  % Right  Rel.  SEM

Biology
  Cellular & Subcellular                          4,001    64    -   1,300       53   .88  3.8
  Organismal                                               64    -               53   .82  4.7
  Population                                               64    -               59   .85  4.2
Engineering
  Engineering                                     1,310    61    -   1,310       41   .86  4.1
  Mathematics Usage                                        61    -               60   .92  3.0
French
  Interpretive Reading Skills                       190    52    9     190       63   .93  2.4
  Literature & Civilization                                52    9               43   .92  2.5
Geography
  Human                                             255    47    9     255       53   .89  3.0
  Physical                                                 47    9               45     -  3.1
Geology
  Stratigraphy, Paleontology, and Geomorphology     765    59    9     765       58   .84  3.7
  Structural Geology and Geophysics                        59    9               56   .?7  3.4
  Mineralogy, Petrology, and Geochemistry                  59    9               53   .88  3.3
History
  European                                          865    52    8     865       45   .92  2.3
  American                                                 52    8               44   .85  3.1
Music
  Theory                                            532    51   10     580       58   .90  3.1
  History                                                  51   10               48   .94  2.3
Psychology
  Experimental                                    3,348    55    9   1,675       47   .87  3.3
  Social                                                   55    9               56   .85  3.5
Spanish
  Interpretive Reading Skills                       227    55   11     225       54   .88  3.8
  Peninsular Topics                                        55   11               51   .89  3.7
  Spanish American Topics                                  55   11               46   .85  4.2

*Values for editions introduced in 1976 (1974 for Geography).

Table 12 shows the correlations of the subscores with each other
and with the total scores. The correlations among subscores (between
.52 and .82) indicate that the subscores are sufficiently independent
to be useful in providing more specific information about the
student's relative achievement in the subdivisions of the field. The
correlations between subscores and total scores are spuriously high
because the questions in each subscore make up a substantial portion,
sometimes more than half, of the questions contributing to the total
score.
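
The size of this part-whole effect can be illustrated with a small
calculation. The sketch below assumes a simplified case that the
manual does not state: a total score that is the plain sum of two
subscores with equal variances. In that case the correlation of either
subscore with the total is sqrt((1 + r) / 2), where r is the
correlation between the two subscores. The function name is
illustrative.

```python
import math


def part_whole_correlation(r_parts: float) -> float:
    """Correlation between one of two equal-variance subscores and
    their sum, given the correlation r_parts between the subscores.

    With T = A + B and equal variances, cov(A, T) is proportional to
    (1 + r) and var(T) to 2 * (1 + r), which gives sqrt((1 + r) / 2)."""
    if not -1.0 < r_parts <= 1.0:
        raise ValueError("r_parts must be greater than -1 and at most 1")
    return math.sqrt((1.0 + r_parts) / 2.0)


# Even unrelated subscores correlate about .71 with their sum.
print(round(part_whole_correlation(0.0), 2))   # 0.71
# Subscores correlating .66 with each other correlate about .91
# with their sum.
print(round(part_whole_correlation(0.66), 2))  # 0.91
```

Under these simplifying assumptions, a subscore correlation in the
middle of the .65 to .67 range reported for Engineering in Table 12
implies a subscore-total correlation of about .91, close to the .90 to
.92 shown there.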

Most students who take Advanced Tests also take the GRE Aptitude
Test. Correlations between verbal and quantitative ability
scores on the Aptitude Test and Advanced Test scores are shown in
Table 13. These data are from 1967-68, the last time correlations
between Advanced Test and Aptitude Test scores were calculated.
The median correlation between Advanced Test score and verbal
ability score was .63, and the median correlation between
Advanced Test score and quantitative ability score was .52. These
correlations suggest that the Advanced Tests measure domains
substantially independent of the verbal and quantitative ability
measures of the Aptitude Test.*



*Correlations between Advanced Test scores and the restructured
Aptitude Test scores, including the analytical ability score, were not
available at the time of this manual's publication.



Table 12: Correlations among Scores on the Advanced Tests
for Which Subscores Are Reported

                                                 Total
Test and Score                                   Score       (1)         (2)         (3)

BIOLOGY
  Total Score                                    1.00        .84 to .87  .89 to .92  .85 to .87
  (1) Cellular and Subcellular Biology           .84 to .87  1.00        .64 to .68  .52 to .58
  (2) Organismal Biology                         .89 to .92  .64 to .68  1.00        .70 to .73
  (3) Population Biology                         .85 to .87  .52 to .58  .70 to .73  1.00

ENGINEERING
  Total Score                                    1.00        .90 to .91  .91 to .92
  (1) Engineering                                .90 to .91  1.00        .65 to .67
  (2) Mathematics Usage                          .91 to .92  .65 to .67  1.00

FRENCH
  Total Score                                    1.00        .93 to .94  .92 to .93
  (1) Interpretive Reading Skills                .93 to .94  1.00        .72 to .74
  (2) Literature and Civilization                .92 to .93  .72 to .74  1.00

GEOGRAPHY
  Total Score                                    1.00        .92 to .96  .85 to .88
  (1) Human Geography                            .92 to .96  1.00        .63 to .68
  (2) Physical Geography                         .85 to .88  .63 to .68  1.00

GEOLOGY
  Total Score                                    1.00        .89         .90 to .91  .86 to .89
  (1) Stratigraphy, Paleontology, Geomorphology  .89         1.00        .71 to .73  .63 to .70
  (2) Structural Geology, Geophysics             .90 to .91  .71 to .73  1.00        .68 to .70
  (3) Mineralogy, Petrology, Geochemistry        .86 to .89  .63 to .70  .68 to .70  1.00

HISTORY
  Total Score                                    1.00        .95 to .97  .89 to .90
  (1) European History                           .95 to .97  1.00        .72 to .76
  (2) American History                           .89 to .90  .72 to .76  1.00

MUSIC
  Total Score                                    1.00        .92 to .93  .96 to .97
  (1) Theory of Music                            .92 to .93  1.00        .77 to .82
  (2) History of Music                           .96 to .97  .77 to .82  1.00

PSYCHOLOGY
  Total Score                                    1.00        .90 to .93  .90 to .92
  (1) Experimental Psychology                    .90 to .93  1.00        .67 to .73
  (2) Social Psychology                          .90 to .92  .67 to .73  1.00

SPANISH
  Total Score                                    1.00        .91 to .93  .87 to .88  .80 to .85
  (1) Interpretive Reading Skills                .91 to .93  1.00        .69 to .70  .63 to .72
  (2) Peninsular Topics                          .87 to .88  .69 to .70  1.00        .63 to .65
  (3) Spanish-American Topics                    .80 to .85  .63 to .72  .63 to .65  1.00




Tests with the highest correlations with quantitative ability scores
were, in order, Sociology, Economics, Mathematics, Engineering,
Biology, and Chemistry. The significance of these correlations is
discussed in Chapter 6 in relation to the construct validity of the
Aptitude Test.

Information Unique to Each Advanced Test 

Several important categories of information about each of the
Advanced Tests are treated in Appendix II, which provides for each
Advanced Test, where appropriate and available, the following:

1. A description of the test's content, specifications, and specific
problems associated with determining the test content.

2. Responses of students to questions about undergraduate
background in the discipline.

3. Reports of validity studies or other studies involving the test.

Table 14 provides a historical perspective on each Advanced Test
offering by listing tests available in 1956-57, in 1966-67, and in
1976-77, with indications of changes that took place in the decades
between.



Reference 

Graduate Record Examinations Board and Council of Graduate
Schools in the United States. Graduate Programs and
Admissions Manual, 1976-77. Princeton, N.J.: Educational Testing
Service, 1976.



Table 14: Advanced Tests Available

Tests in 1956-57:
  Biology, Chemistry, Economics, Education, Engineering, French,
  Geology, Government, History, Literature, Mathematics, Philosophy,
  Physics, Psychology, Sociology, Spanish

Changes in 1957-1966:
  Business, Geography, Music, Physical Education, and Speech were
  added.

21 Tests in 1966-67:
  Biology, Business, Chemistry, Economics, Education, Engineering,
  French, Geography, Geology, Government, History, Literature,
  Mathematics, Music, Philosophy, Physical Education, Physics,
  Psychology, Sociology, Spanish, Speech

Changes in 1967-1976:
  Business, Physical Education, and Speech were dropped; Computer
  Science and German were added; Anthropology was added, then dropped.
  The name of Government was changed to Political Science, and the
  name of Literature was changed to Literature in English.

Tests in 1976-77:
  Biology, Chemistry, Computer Science, Economics, Education,
  Engineering, French, Geography, Geology, German, History,
  Literature in English, Mathematics, Music, Philosophy, Physics,
  Political Science, Psychology, Sociology, Spanish







Table 13: Correlations of Advanced Test Scores
with Aptitude Test Scores, 1967-68

A question mark marks a character that is illegible in the source; a
dash marks an entry missing from the source.

                                    Correlation with
                                   Aptitude Test Score
Test                     Number    Verbal   Quantitative
Biology                   4,696      .66        .60
Chemistry                     -      .50        .58
Computer Science         (No test in 1967-68)
Economics                 1,930      .66        .65
Education                 2,746      .71        .46
Engineering               4,259      .49        .61
French                        -      .69        .41
Geography                   306      .53        .54
Geology                     575      .51        .50
German                   (No test in 1967-68)
History                   4,919      .61        .40
Literature in English     6,276      .75        .40
Mathematics               3,279      .55        .65
Music                       647      .60        .54
Philosophy                  793      .69        .50
Physics                   2,190      .?6        .55
Political Science         2,745      .65        .50
Psychology                5,643      .67        .50
Sociology                 2,151      .78        .67
Spanish                     770      .39        .18

The six Advanced Tests with the highest correlations with verbal
ability scores were, in order, Sociology, Literature in English,
Education, French, Philosophy, and Psychology. The six Advanced



Chapter 5 

STATISTICAL METHODS AND ANALYSES OF 
THE GRADUATE RECORD EXAMINATIONS 



The psychometric problems of the Graduate Record Examinations
Program are similar to those found in any testing program with the
following characteristics: (a) the services offered are based on a
battery of tests rather than on a single test; (b) the tests are
administered more than once a year; (c) they are administered over
an extended period of years; and (d) the complex nature of the
services includes providing information used in making decisions
that have a long-term effect on individuals and are important for
institutions.

In a testing program in which only one test and one administration
are involved, the score scale may be defined in any convenient
and arbitrary fashion. For example, the score scale for grading a
final examination prepared for evaluating one class of students is
selected by the person who does the evaluating and adequately
serves the dual purposes of ranking the students in that class and
establishing an acceptable level of performance. No further use of
the scale is anticipated.

However, when more than one test or score is involved, it may be
desirable to introduce some kind of linkage that binds each test
scale to the others in a manner that will facilitate score
interpretation. Up until 1977, the Aptitude Test provided two basic
measures of academic potential that have been widely used in
admissions decisions. It is obvious that reporting the verbal and
quantitative ability scores on a single scale was a necessary
convenience. It is equally obvious that the new analytical ability
measure, introduced as part of the Aptitude Test in the fall of 1977,
should be reported on that same scale. In the case of the Advanced
Tests, however, it is not so obvious, but the early years of
experience with the development of the Graduate Record Examinations
showed that there were decided advantages in having a scale structure
that would reflect the relative ability levels of students who elect
the various major fields. This is the kind of scaled-score system
currently used in the GRE Program.

A testing program that involves multiple administrations within a
relatively short period of time must provide alternate forms of each
test and comparability of scores across test forms. Each alternate
form must satisfy to a high degree all the requirements of
parallelism in both content and statistical characteristics with the
form it is to replace. Because alternate forms will inevitably differ
somewhat in difficulty level, some statistical adjustments must be
made in the score conversion to make the scores reported on the
two forms directly comparable. This adjustment is accomplished
through the statistical procedure of equating.

When a test is administered frequently over an extended period
of time, it becomes necessary to insure the stability of the scale so
that the experience gained in interpreting scores earned in
previous years can be applied to interpreting scores earned in the
current year. This requires that test forms be indeed parallel so as
to minimize the statistical errors of equating. At the same time,
however, there is a psychometric need to allow for revisions in
content that reflect advancements in knowledge and changing
curricular emphasis within a field. The conflict between the
statistical requirements for scale stability and the psychometric
requirements for test vitality is resolved by permitting gradual
content revision combined with tight statistical control of the scale.

Although scaling and equating procedures provide for consistency
in reporting scores, the interpretation of those scores must
be enhanced by information on the ranking of each score among all
other scores for some meaningful group. Therefore, normative
information is provided along with other interpretive data to help
students and graduate school representatives compare the
performance of an individual with that of others. The Guide to the Use
of the Graduate Record Examinations provides three sets of normative
or interpretive tables. The first set provides percentile ranks for
selected scaled scores for the total GRE population taking a
particular test within a recent three-year period. The second set
provides the same kind of information for recent GRE National
Program seniors and nonenrolled college graduates (the typical
applicants for admission to graduate schools, approximately 60
percent of the total GRE population) who have taken the Aptitude
Test and may have taken an Advanced Test. The third set, which is
based on this same group, provides Aptitude Test score
distributions based on the classification of the examinees by intended
graduate major field. (As data are accumulated for the restructured
Aptitude Test, first administered in October 1977, they will be
presented in succeeding issues of the Guide.)
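
A percentile rank table of the kind described can be sketched as
follows. The definition used here (the percentage of the reference
group scoring below a given scaled score) is an assumption made for
illustration; the Guide may define percentile rank differently, for
example by counting scores at or below the given score. The data and
function name are hypothetical.

```python
def percent_scoring_below(group_scores, scaled_score):
    """Percentage of the reference group whose scaled score is lower
    than scaled_score (one common definition of percentile rank)."""
    if not group_scores:
        raise ValueError("the reference group must not be empty")
    below = sum(s < scaled_score for s in group_scores)
    return 100.0 * below / len(group_scores)


# Hypothetical reference group of eight examinees.
group = [420, 480, 500, 500, 550, 610, 640, 700]
for score in (500, 550, 700):
    print(score, percent_scoring_below(group, score))
```

Because the result depends entirely on which group supplies the
scores, the same scaled score can have different percentile ranks in
the three sets of tables described above.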

Development of the GRE Scaled-Score System

The scaled-score system used in the GRE Program defines a scaled
score of 500 as the mean of the score distribution for the particular
standardization or reference group on which the scale is based and
100 as the standard deviation of that distribution. Scores are
reported as three-digit scores ending in zero and having a
maximum permitted range of 200 to 900 for the Aptitude Test and
200 to 990 for the Advanced Tests. Prior to 1952 each test was
scaled independently of the others by setting the mean of the group
that took the test equal to 500 and the standard deviation equal to
100. Shortly before 1952, the Advanced Tests were extensively
revised and the allotted testing time was extended from one hour
and forty-five minutes to three hours. As a result of these changes
in the tests and changes in the GRE population, a decision was
made to rescale the tests and to recognize the advantages of
changing the type of scaled-score system used.

Students majoring in different fields generally exhibit different
levels and ranges of aptitude development. In the redesign of the
scaled-score system, these differences were to be taken into
account and incorporated in the scales for the individual Advanced
Tests. The data for the rescaling were collected in the spring of
1952 and consisted of scores earned by 2,095 graduating seniors in
11 colleges. This group was considered at that time to be
reasonably representative of the GRE senior population, and,
therefore, a scale system based on their performance would have
normative properties useful in the interpretation of scores obtained
by other groups in subsequent administrations.

Each student in the 1952 scaling group took the Aptitude Test
and also the Advanced Test appropriate to his or her major field.
The scale for each score (verbal ability and quantitative ability) on
the Aptitude Test was established by setting the total-group mean
equal to 500 and the standard deviation equal to 100. This process,
which preserved the rank order of the students, resulted in a linear
transformation for converting raw scores to scaled scores.
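
A linear conversion of this kind can be sketched in a few lines. The
raw-score mean and standard deviation below are invented for
illustration, and the rounding rule (to the nearest multiple of ten)
is an assumption; the manual states only that reported scores end in
zero and fall within a fixed range (200 to 900 for the Aptitude Test).
The function names are illustrative.

```python
def linear_scaling_constants(raw_mean, raw_sd,
                             scaled_mean=500.0, scaled_sd=100.0):
    """Slope and intercept of the line that maps the reference group's
    raw-score mean and standard deviation onto the scaled values."""
    if raw_sd <= 0:
        raise ValueError("raw_sd must be positive")
    slope = scaled_sd / raw_sd
    intercept = scaled_mean - slope * raw_mean
    return slope, intercept


def to_reported_score(raw, slope, intercept, lowest=200, highest=900):
    """Convert a raw score, round to a multiple of ten (assumed rule),
    and keep the result inside the permitted reporting range."""
    scaled = slope * raw + intercept
    rounded = int(round(scaled / 10.0)) * 10
    return max(lowest, min(highest, rounded))


# Hypothetical reference group: raw mean 40, raw standard deviation 12.
slope, intercept = linear_scaling_constants(raw_mean=40.0, raw_sd=12.0)
print(to_reported_score(40, slope, intercept))   # 500
print(to_reported_score(52, slope, intercept))   # 600
print(to_reported_score(100, slope, intercept))  # 900 (top of range)
```

Because the conversion is a straight line with positive slope, it
preserves the rank order of the examinees, as the text notes.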

For the scaling of the Advanced Tests, a more complex statistical
procedure was employed. For each Advanced Test subgroup,
regression coefficients were determined for predicting Advanced
Test scores from the verbal and quantitative ability scores on the
Aptitude Test. Estimations were then made of the raw-score mean
and standard deviation of the entire standardization group on that
Advanced Test. The estimated mean was then set equal to 500 and
the estimated standard deviation equal to 100. The equations for
estimating the mean and standard deviation were developed by
Ledyard R Tucker and are reprinted below from the article written
by Schultz and Angoff (1956) describing the 1952 scaling
experiment.
Notation

$v, q$ = scaled scores on the verbal and quantitative parts of the
Aptitude Test
$x$ = raw scores on the Advanced Test
$t$ = entire standardization group (2,095 seniors)
$s$ = Advanced Test subgroup
$\hat{M}, \hat{\sigma}^2$ = estimated mean and variance
$M, \sigma^2$ = obtained mean and variance
$b$ = regression coefficient
$C_{vq}$ = covariance between verbal and quantitative


Table 15: Scaled-Score Means and Standard
Deviations of the 1952 Standardization Group

A question mark marks a character that is illegible in the source; a
dash marks an entry missing from the source.

                                   Aptitude Test
                              Verbal       Quantitative    Advanced Test
Subgroup                  N   Mean  S.D.   Mean  S.D.      Mean  S.D.
Biology                 209    486    94    499    87       495    96
Chemistry               180    507    96    562    99       530   101
Economics               239    476    89    516    99       494    97
Education               1?0    438    86    434    83       446    93
Engineering             151    454    92    570    86       497    98
French                   32    520    92    453    72       533    92
Geology                  35    473    92    500    36       488    97
German*                  10    543    69    495    79         -     -
History                 181    517    93    468    80       506    97
Literature              239    5?4    98    463    86       548    99
Mathematics              81    508    93    5?7    90       542    97
Philosophy               31    563    96    521    90       549    97
Physics                  49    531   105    633    79       546   101
Political Science       146    498    93    485    89       496    97
Psychology              17?    527    94    495    86       512    96
Sociology               127    482    93    447    77       474    96
Spanish                  34    529   102    451    83       520    99
Entire Scaling Group  2,095    500   100    500   100         -     -

*The German subgroup was too small for use in establishing the scale
for this Advanced Test. A revised version of the test was scaled in
196? in the Undergraduate Program.

[The information in the above table is taken from Schultz and Angoff
(1956).]

The estimated mean and the estimated variance for the 2,095
seniors in the standardization group are given by the equations:



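$$\hat{M}_{x_t} = M_{x_s} + b_{xv \cdot q}\,(M_{v_t} - M_{v_s}) + b_{xq \cdot v}\,(M_{q_t} - M_{q_s})$$

$$\hat{\sigma}^2_{x_t} = \sigma^2_{x_s} + b^2_{xv \cdot q}\,(\sigma^2_{v_t} - \sigma^2_{v_s}) + b^2_{xq \cdot v}\,(\sigma^2_{q_t} - \sigma^2_{q_s}) + 2\,b_{xv \cdot q}\,b_{xq \cdot v}\,(C_{vq_t} - C_{vq_s})$$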



The linear transformation to obtain scaled scores (y) from raw
scores (x) is defined by the equation:

$$y = Ax + B,$$

where

$$A = \frac{100}{\hat{\sigma}_{x_t}}$$

and

$$B = 500 - A\hat{M}_{x_t}.$$
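
The final step of this procedure can be sketched in a few lines of
Python. The sketch assumes the estimated raw-score mean and standard
deviation of the standardization group are already in hand; the
function names and the example numbers (mean 80, standard deviation
25) are hypothetical and are not taken from the 1952 data.

```python
def scaling_parameters(est_mean, est_sd, target_mean=500.0, target_sd=100.0):
    """Return (A, B) for y = A*x + B so that the estimated raw-score
    mean maps to target_mean and the estimated SD maps to target_sd."""
    if est_sd <= 0:
        raise ValueError("estimated standard deviation must be positive")
    a = target_sd / est_sd            # A = 100 / estimated SD
    b = target_mean - a * est_mean    # B = 500 - A * estimated mean
    return a, b


def to_scaled(raw, a, b):
    """Apply the linear raw-to-scaled conversion."""
    return a * raw + b


if __name__ == "__main__":
    # Hypothetical estimates for one Advanced Test.
    A, B = scaling_parameters(est_mean=80.0, est_sd=25.0)
    for raw in (55.0, 80.0, 105.0):
        print(raw, "->", to_scaled(raw, A, B))
```

A raw score equal to the estimated mean lands on 500 by construction,
and a raw score one estimated standard deviation above it lands on 600.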



As can be seen in Table 15, the Advanced Test mean for the
subgroup that actually took that test reflects the aptitude level of
that group. This kind of difference in Aptitude Test performance
was taken into account in establishing the scales for the Advanced
Tests.

As each new test was added to the GRE battery, it was placed on
the GRE scale in a similar manner. The scaling sample consisted of
the first group of seniors who took the new test along with the
Aptitude Test. The scale was established by using the relationships
among the three scores, thus reflecting the aptitude level and range
of the group tested. The following tests were introduced after the
1952 scaling administration:



Anthropology        (scaled in 1968, withdrawn in 1971)
Business            (scaled in 1964, withdrawn in 1970)
Computer Science    (scaled in 1976)
Geography           (scaled in 1966)
German              (scaled in 1969 in the Undergraduate Program
                    as a Field Test; GRE counterpart equated to
                    the UP test in 1970)
Music               (scaled in 1964 for the GRE Institutional
                    Program, added to the GRE National Program
                    in 1965)
Physical Education  (available in 1962, extensively revised and
                    rescaled in 1965, withdrawn in 1970)
Speech              (scaled in 1953, withdrawn in 1970)



For each of these tests, the scaling process involved using the
verbal and quantitative ability scores of the group taking the test to
estimate the Advanced Test performance of the original 1952
standardization group and setting the estimated mean equal to 500 and
the estimated standard deviation equal to 100. A significant
difference between the scaling of these tests and the scaling of those
included in the 1952 scaling experiment lies in the fact that the
relationships among the verbal and quantitative ability and Advanced
Test scores are based on more recent populations. The possible
effect this might have on interpretation of the scores is discussed in
the section dealing with the rescaling study of 1967-68 (page 36).



Scaling of the Analytical Ability Measure

The introduction of the new analytical ability measure in the GRE
Aptitude Test in October 1977 presented a new scaling problem. If
the analytical measure had been introduced with the verbal and
quantitative measures in the original 1952 scaling administration,
the analytical ability scores would have been put on scale by setting
the mean of the standardization group equal to 500 and the
standard deviation equal to 100, as was done with the verbal and
quantitative ability scores. Obviously, this procedure was not
possible in 1977.

The correlations of the analytical ability measure with the verbal
and quantitative ability measures are nearly equal and are rather
high: approximately .76 for verbal and .74 for quantitative. The
method selected for scaling the analytical ability score consisted of
averaging the verbal and quantitative ability score means and
variances using β-weights, as shown in the following formulas.

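Estimate of the analytical score mean:

$$\hat{M}_a = \frac{\beta_v M_v + \beta_q M_q}{\beta_v + \beta_q}$$

and of the variance:

$$\hat{\sigma}^2_a = \frac{\beta_v \sigma^2_v + \beta_q \sigma^2_q}{\beta_v + \beta_q}$$

where $\beta_v$ and $\beta_q$ are the β-weights for the verbal and
quantitative measures,

and the means and variances are in scaled-score units. The resulting
means and standard deviations for the October 1977 scaling
administration are 503 and 126 for verbal ability, 525 and 133 for
quantitative ability, and 513 and 129 for analytical ability.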

Score Equating and Related Concerns

The purpose of equating is to permit introduction of new forms of a
test to replace old forms without losing comparability of reported
scores and long-term continuity of the established score scale.
Among the conditions for sound equating are four of particular
importance: the new form must be parallel to the old form; the
equating samples must be adequate in size and must represent the
population for which the test was designed; the conditions of test
administration must be carefully controlled; and the equating
method must be appropriate for the particular equating experiment.
In practice, however, compromises are sometimes necessary
to accommodate other considerations governing the policies of the
testing program.

A rigid interpretation of the conditions of parallelism would result
in production of a test built with the same content specifications,
the same number of test items, the same level and spread of
difficulty, and the same reliability as the old form. Even with this
objective governing test construction, the requirements for
parallelism can never be precisely met, but the deviations are small
and the statistical adjustments of the equating are adequate. Advanced
Test committees of examiners may from time to time reassess the
content specifications and possibly make changes that reflect
changes in curricula and in populations. An examination of the
content specifications of the old form may indicate that the
difficulty level is no longer appropriate for current groups or that
the test length should be changed. Changes of this kind, however,
are purposely introduced gradually to avoid a serious effect on the
comparability of scores across forms.

In the planning of an equating experiment, every reasonable
effort is made to establish samples that are large and representative.
Practical circumstances do occasionally necessitate compromise.
Sample size is limited by the nature of the test and the number of
examinees tested in one administration. For example, geography is
a very small-volume field, and it is unlikely that more than 150
examinees will take the Advanced Geography Test at one time,
whereas the volume for the Aptitude Test will exceed 50,000
examinees. In the case of Advanced Geography, the small sample
size is balanced to some degree by the fact that the sample
represents the population. In all cases, successful GRE equating is
facilitated by the standardization and tight control of the test
administrations.

The choice of equating method is dictated in part by sampling
possibilities and in part by test construction. The Aptitude Test is
perhaps the most important test in the GRE battery and the most
widely used. For security reasons, it is constructed so that each test
item appears in only one form. This fact, combined with the
availability of large samples, permits the use of an equating method
that should not be used for the Advanced Tests.

A review of the most fundamental method of equating is a useful
preface to discussion of the methods used for equating the GRE
Aptitude Test and Advanced Tests. In this method, the old form and
the new form of a test are administered to the same group of
examinees, and the assumption is made that performance on the
second form is in no way affected by the fact that the examinees
have previously taken the first form. The scale of measurement of
the second form is then transformed in such a way that the
frequency distribution of the transformed scores will be statistically
equivalent to that of the raw scores on the first form. This can be
accomplished through a linear transformation that sets the raw-score
mean and standard deviation of the new form equal to the
corresponding raw-score statistics of the old form.

Let

X = raw score on the new form,
Y = raw score on the old form,
$M_x$, $\sigma_x$ = raw-score mean and standard deviation for the new form,
$M_y$, $\sigma_y$ = raw-score mean and standard deviation for the old form,
$a_1$, $b_1$ = conversion parameters for transforming raw scores on X
to the scale of Y.

Then $M_y = a_1 M_x + b_1$
and $\sigma_y = a_1 \sigma_x$.

Conversion parameters:

$$a_1 = \sigma_y / \sigma_x$$

$$b_1 = M_y - a_1 M_x$$

These parameters convert the X scores to scores on the Y raw-score
scale. If the old form has already been put on a scale for
reporting, it has conversion parameters $A_y$ and $B_y$. Thus, the
scores on the new form can be put on the same scale for reporting by
using the transformation:

$$\text{Scaled Score} = A_x X + B_x,$$

where

$$A_x = A_y a_1 \quad \text{and} \quad B_x = A_y b_1 + B_y.$$
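
The two steps above (equate raw scores, then chain through the old
form's reporting conversion) can be sketched as follows. This is an
illustrative sketch only: the function names, the use of population
standard deviations, and the sample data are assumptions.

```python
from statistics import mean, pstdev


def linear_equating(new_raw, old_raw):
    """Return (a1, b1) that set the new form's raw-score mean and SD
    equal to those of the old form: Y = a1 * X + b1."""
    a1 = pstdev(old_raw) / pstdev(new_raw)
    b1 = mean(old_raw) - a1 * mean(new_raw)
    return a1, b1


def chain_to_reported_scale(a1, b1, A_y, B_y):
    """Combine the raw-to-raw conversion with the old form's
    raw-to-scaled conversion (A_y, B_y) to get (A_x, B_x)."""
    return A_y * a1, A_y * b1 + B_y


if __name__ == "__main__":
    # Hypothetical raw scores from one group on both forms.
    new_form = [10, 20, 30]
    old_form = [20, 40, 60]
    a1, b1 = linear_equating(new_form, old_form)
    A_x, B_x = chain_to_reported_scale(a1, b1, A_y=5.0, B_y=200.0)
    print(a1, b1, A_x, B_x)
```

Chaining the parameters means a new-form raw score can be reported
directly, without first converting it to an old-form raw score.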

This fundamental method has a serious weakness in that the
assumption on which it is based is not valid. The effects of practice
and/or fatigue are disturbing factors that make the method
unacceptable in almost all circumstances. However, the idea of having
the group taking the new form equivalent to the group taking the
old form is sound and forms the basis for other equating methods.
The methods actually used currently in the GRE Program are
discussed in the next sections.

Aptitude Test Equating. Because the examinee volume for the
Aptitude Test is so large, the problem of getting two equating samples
(one for the new form and one for the old form) that can be
considered equivalent is solved by making use of a test
administration practice called "spiraling." The test books are
packaged in spiraled order, alternating Form A (the old form) with
Form B (the new form) in such a manner that at every testing center
half the examinees take Form A and the other half Form B. Because the
volume ensures samples of more than 10,000 cases, the two groups
are considered comparable and the random errors of sampling
negligible. The assumption is then made that the scaled-score
mean and standard deviation for the group taking the new form
should be equal to the corresponding measures for the old form.
The computation is the same as that described in the introduction
to equating.
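
As an illustration of the packaging idea only, the sketch below
alternates two form labels through a stack of test books and splits a
list of examinees accordingly. The function names and the examinee
identifiers are hypothetical; nothing here describes the actual
packaging procedure beyond the alternation stated in the text.

```python
from itertools import cycle, islice


def spiral(forms, n_books):
    """Return form labels for n_books test books in spiraled order."""
    return list(islice(cycle(forms), n_books))


def split_by_form(examinee_ids, forms=("A", "B")):
    """Hand out spiraled books in seating order and group examinees
    by the form each one received."""
    groups = {form: [] for form in forms}
    for examinee, form in zip(examinee_ids, cycle(forms)):
        groups[form].append(examinee)
    return groups


if __name__ == "__main__":
    print(spiral(("A", "B"), 5))
    groups = split_by_form(range(10))
    print({form: len(ids) for form, ids in groups.items()})
```

With an even number of examinees at a center, each form is taken by
exactly half of them, which is what makes the two groups comparable
when the volume is large.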

Under certain circumstances, spiraling cannot be used. For
example, when the timing of the new form is different from that of
the old form, the two forms cannot be administered together in the
same testing room. This was the situation with the introduction of
the restructured Aptitude Test in October 1977. The equating
problem was solved by using an external equating subtest to
establish a link between the old form and the new one. In the January
1977 administration, four different versions of a current form of the
Aptitude Test were used to establish four old-form equating
samples. Two versions included verbal equating subtests as Section IV
(the pretest section) and two had quantitative equating subtests. In
the October 1977 administration, each of the two new forms also
had four versions with the same four equating subtests. Each
equating subtest was used as common material to link the new
verbal measure (or quantitative measure) to its old counterpart,
thus providing two independent links between each new and old
form. The equations used to establish these relationships are
discussed in the following section on Advanced Test equating.

Advanced Test Equating. The relatively small samples available for
equating the Advanced Tests make the spiraling technique
undesirable because the random sampling error would be unacceptably
high. Therefore, a different procedure for establishing equivalent
samples is required. The new-form sample generally consists of all
examinees tested in the current administration. The old-form
sample is selected from groups tested in previous administrations
and is matched, insofar as possible, to the expected performance
level of the new-form group. Both forms of the test contain a
common subset of items that represents the total test in content
and statistical properties. Since the two forms of the test are
parallel, it is reasonable to assume that the practice effect of taking
the equating subset on the scores obtained on the total test is the
same for both groups. The observed relationships between total
score and equating subset score for the two groups are used to
make a statistical estimate of the total-score mean and standard
deviation of the combined groups on each of the two forms. Thus, we
now have one sample (the new- and old-form samples combined
into one sample), with estimated mean and standard deviation on
the new form and estimated mean and standard deviation on the
old form. These estimated values are then set equal to each other
by applying the procedures described in the introduction to
equating.

Two methods for estimating the means and standard deviations
are used. The first, proposed by Ledyard Tucker and referred to as
the Tucker equations, is appropriate when the new-form and the
old-form samples are well matched. The equations for estimating
are given below in the notation used in the introduction to this
section, expanded to include sample identification and several
additional terms.

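Let

X = the score on the new form taken by group α,
Y = the score on the old form taken by group β,
V = the score on the equating subtest taken by the total group t,
    where t = α + β,
$\hat{M}_{x_t}$ = the estimated mean of the total group on Form X,
$\hat{C}_{x_t}$ = the estimated variance (square of the standard
    deviation) of the total group on Form X,
and let C with a single score subscript denote an observed variance
and $C_{xv_\alpha}$ the observed covariance of X and V in group α.

Then

$$\hat{M}_{x_t} = M_{x_\alpha} + \frac{C_{xv_\alpha}}{C_{v_\alpha}}\,(M_{v_t} - M_{v_\alpha})$$

and

$$\hat{C}_{x_t} = C_{x_\alpha} + \left(\frac{C_{xv_\alpha}}{C_{v_\alpha}}\right)^2 (C_{v_t} - C_{v_\alpha})$$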

These equations provide estimates of the total-group mean and
variance on Test X. There is a parallel set of equations for Test Y
based on the observed statistics of the β group. From this point on,
the procedures are those described previously for equating means
and standard deviations. The disadvantage of this method is that it
is based on two conditions that are not always satisfied in practice:
that the two forms are parallel and that the two samples are similar.
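
A minimal sketch of a Tucker-type estimate is given below, assuming
the estimates take the usual regression form: the group-α mean and
variance on X are adjusted by the regression of X on the equating
subtest V, using the difference between the total group and group α on
V. The function names, the use of population variances, and the data
are all assumptions made for illustration.

```python
from statistics import mean, pvariance


def covariance(xs, ys):
    """Population covariance of two equal-length score lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def tucker_estimates(x_alpha, v_alpha, v_total):
    """Estimate the total-group mean and variance on Form X from
    group alpha's X and V scores and the total group's V scores."""
    slope = covariance(x_alpha, v_alpha) / pvariance(v_alpha)
    est_mean = mean(x_alpha) + slope * (mean(v_total) - mean(v_alpha))
    est_var = pvariance(x_alpha) + slope ** 2 * (
        pvariance(v_total) - pvariance(v_alpha)
    )
    return est_mean, est_var


if __name__ == "__main__":
    # Hypothetical data in which X is exactly twice V, so the total
    # group's true X mean and variance are known for comparison.
    v_alpha = [1, 2, 3, 4]
    x_alpha = [2, 4, 6, 8]
    v_beta = [3, 4, 5, 6]
    print(tucker_estimates(x_alpha, v_alpha, v_alpha + v_beta))
```

In this contrived case the estimates (mean 7, variance 9) match what
the total group would have earned on X, because X is a perfect linear
function of V. With real data the match is only approximate.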

The Levine (1955) equations (also called the major axis
equations) were developed for use in those situations not appropriate
for the Tucker equations. In practice it is not always possible to
select similar equating samples, and the Levine equations are
preferred in this case. There are four sets of equations for
estimating total-group means and variances when both total test and
equating subtests are scored in the same way. (In the GRE Program
scores are all computed by the formula: Rights - k × Wrongs.)
Two sets are based on the condition of equal reliabilities of the two
forms, and the other two on the condition of unequal reliabilities.
Each of these pairs is further categorized by the location of the
equating subtest: internal (included in the total score) or external
(equating subtest separate from the total test). For most Advanced
Test equating experiments, the reliabilities are assumed to be equal
(same timing, same number of items, and parallel forms) and
the equating subtest is part of the total test. For some experiments,
the reliabilities are assumed to be unequal because a significant
change has occurred. On rare occasions, the equating subtest may
be administered as a separate test.

Using the same notational system that was used in the preceding
explanations, we have four sets of equations for estimating
total-group means and variances:




1. Equal reliabilities, equating subtest included in the total test.

For this case, and also for the three other cases, there is a
parallel set of equations for Form Y based on the observed
statistics of the β group. This set of equations is the one most
frequently used in equating the GRE Advanced Tests.

2. Unequal reliabilities, equating subtest included in the total test.



The factor θ in the estimate of the total-group variance appears
in both estimates of variance (for Test Y as well as for Test X) and
consequently drops out in the computation of the conversion
parameter, $a_1$. Therefore, there is no need to compute it. These
equations are used for equating Advanced Tests that have
undergone some change. When the timing, for example, was
decreased from 180 minutes to 170 minutes in 1972, this method
was used for the tests.

3. Unequal reliabilities, equating subtest external to the total test.



- C - 



As in the second case given above, the factor θ appears in both
estimates of variance and consequently drops out in the
computation of $a_1$, and there is no need to compute it. This method is
used when the new form has no items in common with the old
form and differs from it in a nontrivial way. This was the case
with the equating of the verbal and quantitative parts of the first
restructured Aptitude Test forms introduced in October 1977. It
would be used with an Advanced Test under unusual
circumstances, but this is not likely to happen.

4. Equal reliabilities, equating subtest external to the total test.



C.. , - C 



This fourth set of equations has not yet been used in the
Graduate Record Examinations Program, but it may be used for
Aptitude Test equating in the future. The strongest advantage of
this method is that it permits the introduction of a new form that
has no items in common with previous forms.



In a practical equating experiment, it is usually not possible to
predict that the two equating samples will be well matched.
Therefore, the common practice is to apply both the Tucker and the
appropriate Levine methods and exercise the option of choosing
the more appropriate method when the sample statistics become
known. This is done by comparing the performance of the two
samples on the common measure, the equating subset V:



[Test of significance for the difference of the means]

[Test of significance for the difference of the standard deviations]

If both those statements are true in a given equating experiment,
then the two samples are sufficiently similar and the Tucker method
is preferred. If either one is not true, then the samples are
sufficiently different to make the Levine estimates the better choice.



Subscore Scaling

Subscore reporting was first used in the Graduate Record
Examinations Program in 1965, when a revised version of the Advanced
Speech Test was introduced. (The test was withdrawn in 1970.)
Subscore reporting was next used with the Advanced Geography
Test in 1969. In 1972, subscore reporting service was expanded and
now includes nine Advanced Tests: Biology, Engineering, French,
Geography, Geology, History, Music, Psychology, and Spanish. In
each case, the subscores for the first form were scaled by setting
the subscore mean equal to one-tenth the total-score scaled-score
mean and the subscore standard deviation equal to one-tenth the
total-score scaled-score standard deviation. Thus, the subscore
scale is a two-digit scale directly related to the total-score scale and
having a maximum permitted range of 20 to 99. For example, if an
examinee has an Advanced Biology score of 600 and is a biology
major, his or her performance on the three subscores is likely to be
in the range of 56 to 64.
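
The one-tenth rule can be sketched as a small function. This is an
illustration under stated assumptions: the function name, the rounding
to a whole number, and the example statistics are hypothetical; only
the one-tenth relationship and the 20 to 99 range come from the text.

```python
def scale_subscore(raw, raw_mean, raw_sd,
                   total_scaled_mean, total_scaled_sd,
                   lowest=20, highest=99):
    """Place a raw subscore on the two-digit subscore scale.

    The subscore mean and SD are set to one-tenth of the total-score
    scaled-score mean and SD, and the result is kept within the
    permitted range.
    """
    target_mean = total_scaled_mean / 10
    target_sd = total_scaled_sd / 10
    scaled = target_mean + target_sd * (raw - raw_mean) / raw_sd
    # Rounding to an integer is an assumption of this sketch.
    return max(lowest, min(highest, round(scaled)))


if __name__ == "__main__":
    # Hypothetical subscore statistics: raw mean 30, raw SD 8, on a
    # test whose total score has scaled mean 540 and scaled SD 100.
    for raw in (0, 30, 38, 100):
        print(raw, "->", scale_subscore(raw, 30, 8, 540, 100))
```

A raw subscore at the subscore mean maps to 54 here, one-tenth of the
assumed total-score mean of 540, and extreme values are held at the
ends of the 20 to 99 range.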

After the initial scaling of the subscores, some problems were
encountered in using common-item equating (equating subtest) for
the subscores of subsequent forms. Both the Tucker and the Levine
methods require each new form to have items in common with the
old form. For sound equating, each test or subtest must include at
least 20 items in common with the old form. Therefore, a test with
three subscores would require at least 60 items in common with the
old form, and that amount of overlap is not acceptable for security
reasons. A compromise solution to this problem is to scale the
subscores of each new form through the total score, as was done in the
initial scaling of the subscores. In the case of the Advanced
Geography Test, the results of this kind of equating were compared
with the common-item equating method in 1969, and the two
methods were found to yield almost identical results. Since that
experiment, the scaling procedure has been used to place the
subscores of new forms on the GRE subscore scale for reporting.

Stability of the Scale 

The preceding discussion of total-score equating explains how the
form-to-form relationships are established. It implies a sequential
operation extending over a long period of time in which each new
form is equated to its immediate predecessor. If that procedure
were followed in practice without modification, the stability of the
scale would be in jeopardy and scale "wobble" could develop. This
presents a serious problem in a program such as GRE, in which as
many as five test forms may be used interchangeably in one
academic year and more than ten forms in a five-year period.
Comparability of scores across forms is such an essential part of the
GRE score reporting service that considerable effort is expended in
establishing and maintaining scale stability.

One solution to this problem is the use of a statistical technique
called double part-score equating. In this procedure, each new
form of a test is equated to two old forms instead of one, thus
obtaining two conversion lines relating scores on the new form to the
GRE scale. The final conversion line is the bisector of the angle
formed by these two lines.

Let

$Y = A_1 X + B_1$ be the conversion line obtained by equating Test X
to one old form,

and

$Y = A_2 X + B_2$ be the conversion line obtained by equating Test X
to a second old form;

then the bisector of these two lines is $Y = AX + B$, where

$$A = \frac{A_1\sqrt{1 + A_2^2} + A_2\sqrt{1 + A_1^2}}{\sqrt{1 + A_1^2} + \sqrt{1 + A_2^2}}$$

$$B = \frac{B_1\sqrt{1 + A_2^2} + B_2\sqrt{1 + A_1^2}}{\sqrt{1 + A_2^2} + \sqrt{1 + A_1^2}}$$

The effect of this procedure is that small equating errors are
averaged out over a period of time.
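
A short sketch of the bisector computation follows. It assumes the
bisector is the line through the intersection of the two conversion
lines whose direction is the sum of their unit direction vectors; the
function name and the example slopes and intercepts are hypothetical.

```python
import math


def bisector(a1, b1, a2, b2):
    """Return (A, B) for the line Y = A*X + B that bisects the angle
    between Y = a1*X + b1 and Y = a2*X + b2."""
    s1 = math.sqrt(1 + a1 * a1)   # length of direction vector (1, a1)
    s2 = math.sqrt(1 + a2 * a2)   # length of direction vector (1, a2)
    a = (a1 * s2 + a2 * s1) / (s1 + s2)
    b = (b1 * s2 + b2 * s1) / (s1 + s2)
    return a, b


if __name__ == "__main__":
    # Two hypothetical conversion lines that cross at X = 10, Y = 200.
    A, B = bisector(10.0, 100.0, 11.0, 90.0)
    print(A, B, A * 10 + B)
```

Because the bisector passes through the point where the two lines
cross, the three conversions agree at that raw score and differ only
slightly elsewhere when the two slopes are close.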

The planning of the sequence of double equating involves
braiding, a technique for establishing form-to-form relationships that
reduces the danger of developing distinct scale "strands." The
equating sequence shown in the diagram below illustrates this
principle. For each arrow linking a new form to a previous form, there
must be a set of equating items common to both forms.



Form A is the first form in the series. Form B is equated to Form A.
Then Form C is equated to both A and B, and the bisector of the
angle between the two equating lines is used for the final
conversion. In the same manner, D is related to both C and A, E is related
to B and D, and F is related to E and C. These equating practices are
effective in establishing and maintaining a score scale with the
following characteristics for each test:

1. The number designating a particular point on the scale
represents the same level of development or achievement
regardless of the test form on which the score was earned.

2. That number represents the same level of development or
achievement for any individual or group of individuals taking the
test.

3. That number represents the same level of development or
achievement for any form of the test administered over an
extended period of time.

The effectiveness of these procedures is illustrated in Figure 1,
which shows the mean scaled scores of the National Science
Foundation (NSF) Graduate Fellowship applicants in engineering from
1960 to 1970.

Figure 1: Level of Performance of NSF First-Year
Engineering Applicants

[Graph: mean scaled score (vertical axis, 550 to 750) by
administration year (horizontal axis, 1960 to 1970), with separate
lines for Quantitative, Engineering, and Verbal.]




The NSF applicant groups applying for first-year graduate
fellowships are similar from year to year. The expectation is that their
test performance will exhibit the same characteristics over a 10-year
period. The graph shows that their mean scores on the Advanced
Engineering Test and the Aptitude Test between 1960 and 1970
have remained as steady as can be expected.

Stability of the Scaled-Score System

There is yet another aspect of scale stability that, although
dependent on the stability maintained through equating, can be
affected by factors outside the statistical considerations described in
the preceding sections. In the description of the development of the
GRE scaled-score system, emphasis was placed on the relationship
between Aptitude and Advanced Test scores and the fact that the
scale for each Advanced Test was designed to reflect the ability
level of the population taking the test. Figure 1 shows the stability
of that relationship for the Advanced Engineering Test. The
engineering applicant group is, as one would expect, a relatively
high ability group, and the verbal and quantitative ability score
means are well above the GRE average. The mean scores on the
Advanced Engineering Test are consistently between the verbal and
quantitative means, thus showing the expected relationships
among the three scores and demonstrating the stability of the GRE
scaled-score system as far as the Advanced Engineering Test is
concerned.

The GRE scaled-score system is defined in terms of the basic
reference group of 1952 and the educational experience of that
group. What happens when the educational experience of more
recent groups differs drastically from that of the 1952 group? A few
years before 1960, there was a strong movement of mathematics
curriculum revision that introduced concepts of modern
mathematics into the high school curriculum and then into the
elementary school curriculum. The effect of this movement was to
speed up the mathematics experience of some students before they
reached college and to enable them to do college level work while
still in high school. Consistent with that curriculum change is the
change in performance on the GRE Advanced Mathematics Test
shown in Figure 2, which presents the mean scores of NSF
first-year applicants in mathematics.

Figure 2: Level of Performance of NSF First-Year
Mathematics Applicants

[Graph: mean scaled score by administration year, 1960 to 1970,
including a line labeled Mathematics.]

The relationship between each Advanced Mathematics Test mean
and the corresponding Aptitude Test means is near the expected
value from 1960 to 1962, but the consistent upward trend in the
Advanced Mathematics Test means is not accompanied by a
corresponding trend in the Aptitude Test means. From 1964 to 1970
(and continuing to the present), the Advanced Mathematics Test
performance is significantly higher than one would expect from the
Aptitude Test performance.

Rescaling Study of 1967-68

The trend observed in the case of the NSF first-year mathematics
applicants was also evident in the general GRE population for other
tests. It was inevitable that long-range factors would change the
meaning of the scaled-score system and affect the interpretation of
GRE scores. Among the changes that were operating since the
establishment of the scale were changes in the GRE population
(approximately a threefold increase in Aptitude Test volume from 196?
to 1970), curriculum changes at the high school level (CBA,
Chemical Bond Approach Project; CHEMS, Chemical Education
Material Study; BSCS, Biological Sciences Curriculum Study;
PSSC, Physical Science Study Committee; SMSG, School
Mathematics Study Group; etc.), and changes in the relationships among
the GRE verbal ability, quantitative ability, and Advanced Test scores.



By 1968 the accumulated evidence of changes in the meaning of
the scaled-score system was sufficient to warrant a statistical
investigation. A rescaling study based on the 1967-68 scores was begun
in 1968 to determine what had happened to the scale in the 15 years
since its development. A report of the results of this study was
prepared for the GRE Board (Wallmark, 1969), showing the magnitude
of the change for each test. As had been predicted, the major
changes occurred in mathematics and the sciences. The results of
the study are summarized in Table 16, which shows the means and
standard deviations on the Aptitude Test and on the Advanced Test
for each Advanced Test group. A comparison of the old-scale and
possible new-scale statistics shows the magnitude of the
adjustment that would result from a rescaling. Note in particular the
decreases in standard deviations for the Advanced Mathematics
and Physics Tests.



Table 16: Scaled-Score Means and
Standard Deviations of the
1967-68 Rescaling Samples

                                                           Advanced Test Advanced Test
                                 Verbal      Quantitative   (old scale)   (new scale)
Subgroup                    N   Mean   S.D.   Mean   S.D.   Mean   S.D.   Mean   S.D.
Biology                 4,?96      ?    ?10    580    104    ?14    107    544    106
(illegible)                 ?    54?    119    663     92    ?13    100    588    10?
(illegible)                 ?    552      ?    615    107    621      ?    567    110
(illegible)             2,74?    4?2     95    465    104    471     74    476     9?
(illegible)                 ?    503    119    697     72    619    102    624     91
(illegible)             1,2??    586    106    512    105    559      ?    556    103
(illegible)               308    5??    101    534    107    502     ?6    5??    102
Geology                   575    5?0      ?    615     93    551     96    554    102
(illegible)                 ?    567    109    515    110      ?     80    537    104
Literature              6,276    607    101    516    109    570     85    580    100
Mathematics             3,279    558    119    700     ?1    653    150    632     97
(illegible)                ?7    519    113    502    114    518     91    508    107
(illegible)                 ?    625    1?4    573    119    649      ?    586      ?
(illegible)             2,1??    ?78    119    712     ?3    614    13?    632     9?
Political Science       2,745    568      ?    536    113    519     90    542    1?6
(illegible)             5,?43    56?    104    ?59    109    544     13    54?    104
(illegible)             2,151    537    119    502    1?3    533    116    522    116
(illegible)               770    545      ?    479    104    545    101    518    101
(illegible)               ?95      ?      ?    466    101    45?     8?    502     ?9
Entire Scaling Group   45,39?    ?54    116    574    128



The rescaling study demonstrated that the across-field
comparisons of Advanced Test scores that were possible with the 1952
senior population could not be made with the 1967-68 senior
population, and such comparisons cannot be made with a current
group. The GRE Board considered the implications of the study of
the scale and deliberated over the possibility of rescaling the
Advanced Tests. Rescaling would have permitted users to make
limited comparisons of Advanced Test scores in different fields.
However, rescaling would also have meant that past scores and
rescaled scores for a given Advanced Test would not have the same
meaning and could not be compared. The issue was fundamentally
this: Which will be more beneficial to most users, the ability to
compare the Advanced Test scores of people in different fields
taking different Advanced Tests or the ability to study score trends
effectively within a field? Although fellowship sponsors might find
comparisons of Advanced Test scores across fields to be important,
most institutions using the scores would be interested in
comparing the scores of students taking the same Advanced Test and
would find comparability across years to be essential to study of the
usefulness of the scores over a period of time. Thus, it was decided
to continue use of the original scale, to meet the needs of the
majority of users, and to provide rescaling information to fellowship
sponsors requiring the capacity for across-field comparisons of
Advanced Tests to supplement the across-field comparisons already
possible using the Aptitude Test. Thus, the Guide to the Use of the
Graduate Record Examinations instructs users that scores across
Advanced Tests are not comparable although scores from one year
to the next for a given Advanced Test are comparable. Although the
linkage of the Advanced Tests with the Aptitude Test has not been
sustained, for historical reasons any newly introduced Advanced
Test is scaled by the same method used in the past, providing initial
linkage in the year of introduction.



Reliability and Error of Measurement

Reliability is the extent to which a test is consistent in
measuring whatever it does measure. It indicates how much of the
variation in the results of testing a group of individuals can be
attributed to the systematic sources of variation one is trying to
measure and how much to other sources of variation that may be
classed as errors of measurement. The index of reliability is
usually stated as a correlation coefficient for two sets of
similar measurements and may be interpreted as the ratio of the
true-score variance to the observed-score variance.

One method of estimating the reliability of a test is to
administer the test twice to the same group of individuals and to
correlate the two sets of measurements. Because this method has
both statistical and practical disadvantages, it is seldom used.
For the GRE this procedure would introduce memory as an unwanted
factor of systematic variation, thus overestimating the
reliability. The main practical disadvantage for the GRE is that
the three-hour time limit of each testing session is excessive for
a double session and would introduce a factor of fatigue.

The statistical disadvantage of the test-retest method can be
overcome by using an alternate parallel form of the test for the
second testing. Although this second method does not solve the
problem of the practical disadvantages mentioned above, it is the
preferred method when speed is an important factor in the scores.

The split-halves method, a variation of the parallel-forms method,
can be used for estimating the reliability of tests that are
sufficiently long. One form of the test is administered to a group
of individuals. The test is divided into halves for scoring, and
the correlation of the two sets of scores is used as the
reliability estimate after correction for test length. The
difficulty with this method lies in splitting the test into halves
that can be considered parallel in both content and statistical
properties. The main advantage, however, is that the effects of
content sampling are considered without the effects of memory or
response variation over time.
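
The split-halves procedure described above can be sketched roughly as follows. The odd/even split, the use of the Spearman-Brown formula as the "correction for test length," and all function names are assumptions made for illustration; the manual does not state how halves were formed or which correction was applied.

```python
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation of two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half):
    """Step a half-test correlation up to full test length (assumed correction)."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one list of 0/1 item scores per examinee."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd, even))
```

An odd/even split is only one of many possible splits, which is the practical difficulty the paragraph above points to: different splits generally give different estimates.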

The method of estimating reliability coefficients used in the GRE
Program employs analysis of variance procedures with a single
administration of one test form. This method, proposed by Kuder
and Richardson (1937) and further developed by Dressel (1940), is
based on interitem correlations and lends itself well to computer
processing. The formula, now known as Kuder-Richardson formula
(20), is given below:

$$r_{tt} = \frac{n}{n-1}\left(1 - \frac{n\,\overline{pq}}{\sigma_t^2}\right)$$

where

$r_{tt}$ = the estimated reliability of test t,
$\sigma_t^2$ = the observed variance of test t,
$n$ = the number of items in test t,

and

$\overline{pq}$ = the average of the item variances.

This Kuder-Richardson formula (20) has been adapted for formula
scoring (see page 43). It provides an average of all reliability
coefficients that can be obtained from all possible ways of
splitting the test.
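
A minimal sketch of Kuder-Richardson formula (20) for rights-only (0/1) scoring is given here for orientation. The data layout, the function name, and the use of the population variance (dividing by N) are assumptions; the adaptation for formula scoring used in the program is not covered by this sketch.

```python
def kr20(item_scores):
    """KR-20 for a list of examinee rows, each a list of 0/1 item scores."""
    n_items = len(item_scores[0])
    n_people = len(item_scores)

    # Observed variance of total scores (population form, an assumption).
    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n_people
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people

    # Sum of item variances p*q, i.e. n times the average item variance.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_scores) / n_people
        sum_pq += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)
```

For the four hypothetical examinees `[[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]` the total-score variance is 1.25 and the item variances sum to 0.625, so the estimate is 1.5 × (1 − 0.5) = 0.75.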

There has been considerable debate among measurement specialists
as to the appropriateness of the Kuder-Richardson formula (20) in
estimating the reliability of a test when all examinees do not
finish the test. The criticism is important for the GRE Program
because some of the Advanced Tests show moderate speededness when
the usual criteria are applied. However, a good portion of the
"speededness" may reasonably be attributed to difficulty. Frances
Swineford (1973) made a comparison of the Kuder-Richardson formula
(20) results obtained on moderately speeded forms of the College
Board Scholastic Aptitude Test with results obtained by using
three other methods and demonstrated that the Kuder-Richardson
formula (20) can be used with confidence for estimating the
reliability of tests that are moderately speeded.

Reliability coefficients indicate the portion of score variance
that can be attributed to true-score variation, but they are of
limited value for test users. The standard error of measurement is
the statistic commonly used to interpret scores because it is a
measure of reliability stated in score units and indicates the
probable range of discrepancy between the observed scores and the
true scores of a group. In interpreting test scores for a group,
one may say that the true score for each individual in the group
lies within an interval extending from one standard error above to
one standard error below the observed score and expect to be right
about 2 times out of 3. By extending the range to two standard
errors above and below the observed score, one increases the
proportion of correct statements to 95 out of 100.
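
The "2 times out of 3" and "95 out of 100" figures can be checked with a short sketch, assuming errors of measurement are normally distributed (an assumption not stated in the text). The observed score of 500 and standard error of 30 in the usage note are hypothetical.

```python
from statistics import NormalDist


def score_band(observed, sem, n_sems=1):
    """Interval of n_sems standard errors on either side of an observed score."""
    return observed - n_sems * sem, observed + n_sems * sem


def coverage(n_sems):
    """Proportion of a normal error distribution lying within +/- n_sems."""
    z = NormalDist()
    return z.cdf(n_sems) - z.cdf(-n_sems)
```

Under the normality assumption, `coverage(1)` is about .68 and `coverage(2)` about .95, and `score_band(500, 30, 2)` gives the interval from 440 to 560.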

Item and Test Analysis

Equating is based on the assumption that the various forms of a
test are parallel in both content and difficulty. To the extent
that this condition is not met, the standard error of measurement
computed from a distribution of scores earned on a number of
different forms would be higher than that estimated from scores on
only one form of the test. It is obvious, therefore, that
achieving parallelism is an important aspect of test construction,
and quality control procedures must be established to evaluate the
success of that effort. The specifications for any given test
within the GRE Program consist of three principal elements: 1) the
distribution of item content and skills to be assessed; 2) the
distribution of item difficulty; and 3) the distribution of
item-test correlations. This section is concerned with the two
latter elements and with other psychometric properties of the
test.

Item Analysis

Item analysis is a statistical procedure that provides detailed
information about each item, describing the relative
attractiveness of the options, the difficulty level of the item
for the analysis sample,






the power of the item to discriminate among the examinees with
respect to a given criterion, and the way the item functions in a
particular test.

Before an item is used in a final form of the Aptitude Test, it is
pretested and analyzed to identify possible weaknesses and to
determine difficulty level and discrimination. Items that pass
inspection are then placed in an item pool to make them available
for test assembly; those that do not pass are rewritten to correct
the flaws or are discarded. In the GRE Program, pretesting serves
a useful quality control function for Aptitude Test construction.

For the Advanced Tests, although analyzed items are used as
equators, reliance is placed on the subjective judgment of the
committee of examiners in estimating difficulty level and
discriminating power of newly written items. To a remarkable
degree, this procedure is effective. Nevertheless, when a new form
is first administered, it is subjected before final scoring to a
preliminary item analysis for the sole purpose of identifying
items that may be ambiguous or faulty. For most forms no problems
emerge. When a problem is identified through the analysis and is
confirmed by the subject matter specialists, the item is dropped
from scoring if there is no correct response or double-keyed if
there are two correct responses. Since GRE Advanced Tests contain
an ample quantity of items, reliability remains above .90 in spite
of the deletion.

Normalized Total-Group Method of Item Analysis. The method of item
analysis used in the GRE Program is called the normalized
total-group method. A sample analysis of an Advanced Biology Test
item is shown in Figure 3. The item analysis form consists of two
parts, the upper grid showing the tally of how many examinees in
each quintile group selected each distracter, and the lower strip
providing the essential item statistics.



Figure 3: Item Analysis Sample

[Item analysis form for an Advanced Biology Test item; the
response grid and statistics strip are not reproduced.]





Under optimum conditions, the item analysis sample is selected to
be representative of the population for which the test was
designed and to be adequate in size. The criterion selected for
the analysis is usually the total score on the test or subtest,
but may be an external criterion related to the item type. The
distribution of criterion scores is then converted to a normal
distribution with a mean set at 13.0 and a standard deviation at
4.0. The transformation is shown in Figure 4, which shows the left
half of a normal curve divided into sections one-fourth standard
deviation wide. Each section represents a converted score, and the
area of that section indicates the proportion of observations in
the interval identified on the graph by the value of its midpoint.
For example, if the sample size is 1,000, the two lowest scores
will be converted to 1 (1,000 × .0020).




The next two will be converted to 2 (1,000 × .0023, rounded), and
so on. The cumulative area cA is used to resolve rounding
problems. Thus, each individual in the sample is assigned a
converted score in the range of 1 to 25, and this score is used in
the item analysis computation.
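
The conversion just described can be sketched as a rank-based normalization. The treatment of the two end categories as open-ended tails, the use of rounded cumulative counts, and the arbitrary handling of tied raw scores are assumptions made for illustration; the function name is hypothetical.

```python
from statistics import NormalDist


def normalized_criterion_scores(raw_scores, mean=13.0, sd=4.0, low=1, high=25):
    """Assign each examinee a converted score from low to high by rank."""
    n = len(raw_scores)
    z = NormalDist()

    # Cumulative number of examinees at or below each converted score,
    # taken from the cumulative normal area at the section's upper edge.
    cum_counts = []
    for k in range(low, high + 1):
        area = 1.0 if k == high else z.cdf((k + 0.5 - mean) / sd)
        cum_counts.append(round(n * area))

    # Walk through examinees from lowest to highest raw criterion score.
    order = sorted(range(n), key=lambda i: raw_scores[i])
    converted = [0] * n
    k_index = 0
    for rank, i in enumerate(order, start=1):
        while rank > cum_counts[k_index]:
            k_index += 1
        converted[i] = low + k_index
    return converted
```

With 1,000 distinct raw scores, the cumulative areas below the first two section boundaries are about .0020 and .0043, so the two lowest examinees receive 1 and the next two receive 2, matching the example in the text.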



Figure 4: Criterion Score Conversion




The item analysis sample is divided into five equal subgroups
based on the criterion score: the lowest fifth, the second fifth,
the middle fifth, and so on. The responses of the sample are
tallied and the frequencies entered in the grid of the item
analysis form. This display is used to determine how each option
functions in discriminating among the examinees. The total
frequency for each response and for omissions is entered in the
strip at the bottom of the form with the mean converted criterion
score for that group. In Figure 3, for example, 128 examinees
omitted the item, and this group has a mean criterion score of
10.0; 1,150 chose the correct response A and have a mean criterion
score of 14.2. The right-hand portion of the form shows the mean
criterion score of the total group ($M_{total}$) to be 13.0, the
proportion of the group reaching the item ($P_{total}$) to be
1.00, and the proportion answering correctly ($P_+$) to be .73.
$P_+$ is translated into a difficulty index, $\Delta$, which is
also on the 13.0/4.0 scale. The biserial correlation is determined
by the formula

$$r_{bis} = \frac{M_+ - M_{total}}{\sigma_T} \cdot \frac{P_+}{y}$$

where

$M_+$ is the mean criterion score for the group answering
correctly,

$\sigma_T$ is the standard deviation for the total sample (set
equal to 4.0 in this method),

$P_+$ is the proportion answering correctly,

and

$y$ is the ordinate in the unit normal distribution which divides
the area under the normal curve into $P_+$ and $1 - P_+$.

The relationships between $P_+$ and $\Delta$ and between $P_+$ and
$P_+/y$ when $M_{total}$ is 13.0 are shown in Figure 5.
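
As an illustration of the biserial computation, the sketch below applies the standard biserial formula to the values quoted from Figure 3 (mean of 14.2 for those answering correctly, total mean 13.0, standard deviation 4.0, proportion correct .73). The function name is hypothetical, and the formula is the conventional biserial, assumed here to match the one printed in the manual.

```python
from statistics import NormalDist


def biserial(mean_correct, mean_total, sd_total, p_correct):
    """Biserial correlation between a right/wrong item and a criterion."""
    z = NormalDist()
    # Ordinate of the unit normal curve at the point that divides the
    # area into p_correct and 1 - p_correct.
    y = z.pdf(z.inv_cdf(p_correct))
    return (mean_correct - mean_total) / sd_total * p_correct / y


r_bis = biserial(14.2, 13.0, 4.0, 0.73)
```

For these inputs the result is roughly .66; the value actually printed on the Figure 3 form is not legible in this copy, so no comparison is claimed.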



Figure 5: Relationship between $P_+$ and $\Delta$ when
$M_{total}$ = 13.0

[Graph not reproduced. Horizontal axis: $x$ from -2.0 to 2.0 in
steps of 0.5, with the corresponding $\Delta$ values from 5.0 to
21.0 in steps of 2.0; probability scale for $P_+$ from 1 to 99
percent.]



For tests that tend to be difficult for the group, not all
examinees find time to attempt all items. An assumption is made
that an individual has attempted every item up to and including
the last one that he or she answered and did not reach any of the
remaining items. In the typical case, the dropping out begins in
the lowest ability group, with a consequent change in the ability
level of the group that does reach the item. $M_{total}$ is based
on the scores of those who, by this definition, reach the item.
This is not a perfect description of what occurs, but, provided
that the proportion of examinees who drop out is not extremely
large, it is a reasonable basis for using a variable $M_{total}$
in computing $\Delta$, when speed and power are highly correlated,
as is the case in the GRE tests.

The dropping out is indicated in two ways on the item analysis
form: the last row of the grid shows the number of examinees in
each group who have reached the item, and the $P_{total}$ box in
the strip shows the proportion of the total sample reaching the
item. $M_{total}$ is the mean criterion score for that group and
usually increases toward the end of the test. The difficulty index
is then computed by the formula

$$\Delta = M_{total} + 4x$$

where

$x$ = the deviation from the mean of the normal curve
corresponding to $P_+$, stated in standard deviation units, as
shown in Figure 5.
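
A small sketch of the difficulty index follows. It assumes the sign convention implied by the Figure 5 axis (x = -2.0 pairs with a delta of 5.0 and x = 2.0 with 21.0), so that easier items, with higher proportions correct, receive lower deltas. The function and parameter names are hypothetical.

```python
from statistics import NormalDist


def observed_delta(p_correct, mean_reaching=13.0, sd=4.0):
    """Difficulty index on the 13.0/4.0 scale for a proportion correct."""
    # Normal deviate cutting off the top p_correct of the distribution;
    # it is negative when more than half the group answers correctly.
    x = -NormalDist().inv_cdf(p_correct)
    return mean_reaching + sd * x
```

An item answered correctly by half the group gets a delta equal to the mean of the group reaching it; the Figure 3 item, with .73 answering correctly and a mean of 13.0, comes out near 10.5 under these assumptions.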

The factors that affect $r_{bis}$ include the degree of
independence of the item and the criterion (item included in the
score or item not included), the nature of the criterion (same
subject matter as the item and homogeneous, different subject
matter, or mixed subject matter), and the range of the raw
criterion (less than 25 raw-score points or greater). To aid in
interpreting the biserial correlation for a particular analysis, a
coded description of the criterion score is included in the upper
right-hand box of the item analysis strip. There are three parts
to the code: the first letter indicates the location of the item
with respect to the criterion (I means internal, X means
external); the second letter indicates the nature of the criterion
(H means that the subject matter is homogeneous and the same as
that for the item, D means that the subject matter is different,
and M means that it is mixed); and the number indicates the number
of items in the criterion (if the item is included and the
criterion is based on fewer than 25 items, the biserial will have
a spurious component).

In the GRE Program, a recurring problem with item analysis is that
the sample available for item and test analysis is not
representative of the total group for which the test was designed
because groups taking the test vary by year and, within a year, by
administration date. If the sample is very able, the observed
deltas will be relatively low. If two forms of a test have been
analyzed with samples of different ability, the observed deltas
cannot be directly compared. This problem is solved by
establishing a base scale with a sample selected to be
representative of applicants for admission to graduate school. In
the GRE Program, the basic reference sample for item analysis for
each test consists of the seniors who were tested in the academic
year 1962-63. The samples for the analyses of the next new forms,
which were introduced in 1964-65, were selected to represent the
reference groups and were used to establish the $\Delta$ scales.
In all subsequent analyses, the observed deltas, $\Delta_o$, were
equated to place them on the base scale, and the equated values,
$\Delta_E$, were entered on the item analysis form along with the
scale identification.

Delta equating is accomplished by relating the observed results of
an item analysis of a test to the standard reference population by
including in the test a group of items that have been used in the
program and for which equated deltas are already known. This set
of items can be, and usually is, the set used for score equating
where common items are used for equating. A scatterplot relating
the observed deltas obtained in the new analysis with the
corresponding equated deltas from a previous analysis is then used
to generate an equating line. This line is then used to estimate
equated deltas for the new items in the test:

$$\Delta_E = a\,\Delta_o + b$$

where $a = \sigma_E / \sigma_o$ and $b = M_E - a M_o$, with the
means ($M$) and standard deviations ($\sigma$) taken over the
equated ($E$) and observed ($o$) deltas of the common items.
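
The equating line can be sketched as a linear transformation that matches the mean and standard deviation of the common items' observed deltas to those of their previously equated deltas. This reading of the slope and intercept, the function names, and the delta values in the usage note are assumptions for illustration.

```python
from statistics import mean, pstdev


def delta_equating_line(observed_common, equated_common):
    """Slope a and intercept b from the common items' two sets of deltas."""
    a = pstdev(equated_common) / pstdev(observed_common)
    b = mean(equated_common) - a * mean(observed_common)
    return a, b


def equate_delta(delta_observed, a, b):
    """Place an observed delta for a new item on the base scale."""
    return a * delta_observed + b
```

For hypothetical common items with observed deltas 10, 12, 14 and equated deltas 12, 16, 20, the line has slope 2 and intercept -8, so a new item with an observed delta of 11 would be placed at 14 on the base scale.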



Test Analysis 

Test analysis is the final stage in the test development process.
When a final form of a test has been administered, it is subjected
to a detailed analysis to determine the extent to which the test
specifications have been met and to provide information that can
be used to guide future test construction. The analysis data are
used to evaluate the test as a whole, to determine its efficiency,
its discrimination in the score region used for selection
decisions, the reliability of the reported scores, the
intercorrelations among reported scores and special subscores, its
speededness characteristics, the distributions of item difficulty
indexes and biserial correlations for the reported scores and
special subscores, and special score data to study examinees'
patterns of response to test questions. All this information is
condensed in a test analysis report, written as a within-office
quality-control document providing information for test
specialists and test committees, for the research staff, and for
clients and test experts who review the tests.

The first step in the planning of a test analysis is to select an
appropriate sample. Under ideal conditions, the analysis sample
would represent the population for which the test was designed.
Under practical conditions this ideal is seldom possible. Although
the sample may not represent the target population, it almost
always represents the total group examined at the first
administration of the test. The sample size should be sufficiently
large to ensure reliable results. Between 600 and 1,000 cases in
the sample is considered adequate, but, if the total group tested
at the first administration is less than 3,000, all cases are
generally used in the sample. If the total group is much larger
than that, a sample is selected by taking one case out of every n
cases to ensure a sample size of approximately 1,000 cases. If the
total group is much smaller than 600, a decision is made whether
to do the analysis with the available cases or to delay until
additional cases from some future administration of that form
become available and can be combined with the original group.
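
The sample-selection rule can be sketched as below. The text says "much larger" and "much smaller," so the hard cutoffs at 3,000 and 600, the rounding used to pick n, and the function name are all simplifying assumptions.

```python
def select_analysis_sample(cases, target=1000, use_all_below=3000, minimum=600):
    """Return the analysis sample, or None when a judgment call is needed."""
    n_cases = len(cases)
    if n_cases < minimum:
        # Too few cases: analyze anyway or wait for a later administration.
        return None
    if n_cases < use_all_below:
        return list(cases)
    # Systematic sample: one case out of every `step` cases.
    step = max(1, round(n_cases / target))
    return cases[::step]
```

A group of 10,000 would be sampled at every tenth case, giving 1,000 cases; a group of 2,500 would be used whole.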

The data processing phase of the test analysis work produces the
following computer output: distributions for all reported scores
and item analysis criterion scores, item analysis, a delta vs.
r-biserial distribution sheet for each set of items analyzed, the
results of the delta equating, an intercorrelation table of all
variables used in the analysis, and a test analysis tabulation for
each separately timed section of the test. This detailed
information is then analyzed and summarized in the test analysis
report.

Examples of the test analysis data for a recent form of the GRE
Advanced French Test are presented in Tables 17 to 24. Table 17
presents two frequency distributions, one based on the performance
of the total group of examinees who took this particular test form
when it was first administered in October 1976 and the other based
on the performance of the total group examined during the year
from October 1975 through September 1976. As can be seen from the
first distribution, the number of cases available for the analysis
sample is well below the number considered to be adequate.
Although that number is small, a comparison of the total-group
mean and standard deviation with the corresponding statistics of
the 1975-76 norms group shows the total group for October 1976
represents the norms group rather well. This fact, combined with
the need for some information to help in the construction of the
next form, led to a decision to proceed with the analysis. At the
bottom of Table 17 are summary statistics and conversion data,
including the conversion equation for translating raw scores to
scaled scores.

The information in Table 17 is used to describe the efficiency and
skewness of the test. Under normal circumstances, a test is most
efficient for the group if the score distribution covers the
entire possible score range. In this case the maximum obtained
formula score of 180 is nearly one-half standard deviation below
the maximum possible score of 195. This characteristic is common
among the GRE Advanced Tests, which cover subject matter selected
from a broad range of undergraduate curricula rather than from one
universal curriculum. The test was relatively difficult for the
group. The mean score on the 195 questions is 86.96. A test of
middle difficulty would be expected to yield a mean formula



Table 17: Total Score Distributions

Advanced French Test, Form A
(Taken by candidates for admission to graduate schools,
October 1976)

Form A total group, October 1976

Raw Score    Scaled Score    Frequency    Percent Below
                                          Lower Limit
180          770                  1           99.5
171-179      740-770              1           99.0
162-170      720-740              -           99.0
153-161      700-720              3           97.4
144-152      670-690              9           92.8
135-143      650-670              4           90.7
126-134      620-650              5           88.1
117-125      600-620             12           82.0
108-116      580-600             16           73.7
 99-107      550-570             21           62.9
 90-98       530-550             19           53.1
 81-89       500-530             12           46.9
 72-80       480-500             28           32.5
 63-71       460-480             19           22.7
 54-62       430-450             15           14.9
 45-53       410-430             10            9.8
 36-44       390-410              8            5.7
 27-35       360-380              5            3.1
 18-26       340-360              3            1.5
  9-17       310-330              1            1.0
  0-8        290-310              2            0.0
Total                           194

Total group, October 1975 through September 1976

Scaled Score    Frequency    Percent Below
                             Lower Limit
800-820              1           99.9
770-790              4           99.6
740-760              9           98.8
710-730             21           97.0
680-700             29           94.5
650-670             45           90.6
620-640             68           84.8
590-610             66           79.1
560-580            135           67.4
530-550            141           55.3
500-520            156           41.9
470-490            166           27.6
440-460            126           16.7
410-430             85            9.4
380-400             48            5.3
350-370             35            2.2
320-340             23            0.3
290-310              2            0.1
260-280              1            0.0
Total            1,161

Summary statistics           Form A,           1975-76
                             October 1976      norms group
Raw-score mean (M)           86.96
Raw-score S.D.               33.25
Raw-score median (Md)        84.50
Scaled-score mean            521               519
Scaled-score S.D.            89                90

Converted to the GRE scale through scores on [number illegible]
items in common with one form and 40 items in common with another
form (195 items).

Y = 2.6612 X + 289.3151



score equal to approximately half the total number of test
questions, in this case 97.5, the score that would be expected for
an examinee who knew the answers to half the items and responded
at random to the remaining ones. The skewness of the score
distribution is another characteristic used for evaluating the
effectiveness of the test construction. Skewness, or the third
moment, is defined by the formula

$$\alpha_3 = \frac{\sum x_i^3}{N\sigma^3}$$

where $x_i$ is the deviate score $(X_i - M)$ of the $i$th examinee
and the summation is done over the number of examinees ($N$). A
convenient approximation of skewness is given by the formula
$3(M - Md)/\sigma$, which in this case is .213. This estimate is
not reliable for a sample size as small as the N of the total
group, but it does indicate some positive skewness, which usually
means that the test is relatively difficult for the group tested.
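
Both skewness measures can be sketched in a few lines. The standardized third-moment form and the use of the population standard deviation are assumptions, as are the function names; the scores in the usage note are hypothetical.

```python
from statistics import mean, median, pstdev


def moment_skewness(scores):
    """Standardized third moment of a score distribution."""
    m, s = mean(scores), pstdev(scores)
    return sum((x - m) ** 3 for x in scores) / (len(scores) * s ** 3)


def median_skewness(scores):
    """Approximation 3(M - Md) / sigma based on the mean-median gap."""
    return 3 * (mean(scores) - median(scores)) / pstdev(scores)
```

A symmetric set such as 1, 2, 3, 4, 5 gives zero on both measures, while a set with a long upper tail such as 1, 1, 2, 2, 3, 10 gives positive values on both, the pattern the text associates with a test that is relatively difficult for the group.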

Two subscores are reported for the Advanced French Test. The
distributions of these scores for the total group are presented in
Table 18. Using the same analysis procedures, one can see that the
Interpretive Reading Skills subtest is rather easy for the group
and that the score distribution is characterized by a high mean
and slightly negative skewness. The Literature and Civilization
subscore, on the other hand, has a very low mean and is positively
skewed, indicating a very difficult subtest. With this
information,



Table 18: Subscore Distributions

Advanced French Test, Form A
(Taken by candidates for admission to graduate schools,
October 1976)



2. Literature and Civilization



1 


7" 
1 

■\- 

i 


la 


** 


i " 
[ 

4 




bit* 4« 


[ 

» 1 


T 








?; 


— 
/4 


i 


r T 

S9 S 1 90 






80 




995 


as 59 


! 


o9 


?r 




4 


97 4 




S9 


76 


7« 


; 




m 84 




b? 






3 


9S9 




S4 


73 


75 






?^ '^rf 




fi^ 


67 


i 


S 


91 8 




?9 


71 


73 


2 


' 97 4 


/O M 




62 








1 


;^ 


M 


es 


70 


7 


1 94 3 






«0 








809 


65 


6» 


66 


68 


9 


S97 


GO 64 




SJ 


59 




22 


&96 


* 60 


64 


63 


65 


6 


1 866 






6S 


SJ 




74 


S7? 


% 


S9 


61 


63 


13 


i 79 9 












1« 


49 r> 




M 


S3 




9 


?5 3 


4^ 49 




to 


52 




?\ 


3S 1 


4^) 


49 


56 


58 


12 


1 69 1 


40 M 




4/ 


«9 




2* 


2b Z 


40 


44 


53 


55 




603 


35 39 




4^) 


47 




14 


17S 


3S 


39 


51 


53 


24 


4? 9 


30 34 




K 


44 






tZ4 


30 


34 


4fi 




1 » 


402 


?5 Z9 




40 


*2 






9e 


?5 


29 


46 


48 


?? 


2^9 


ZO 24 


i 


1/ 


■M 




*> 


■ G ? 


20 


?4 


43 


45 


, ?4 


16^ 


lb !9 


J 




3? 






3G 




19 


4r 


43 




5? 


10 :4 


i 


3J 


J4 




1 


1 3 1 


- m 


14 


39 


40 


1 5 


?6 






30 


3? 




4 


1 


^ 


9 


16 


38 


4 








?S 


30 






rift 




4 


34 


36 




; 05 




i 












s 


1 


34 


1 I 

1 * ^ . 


I 00 



194 [ 



= 49 64 

^, = 17 *j 

- M 50 



Ety \cti[r>( metfi tnd 
stinairi dtvitt'On to 

t 04930 X f &63; 



L 

- 3ni 

17 ft4 

- 53 9 
> - 89 

Vd, 3500 



(103 it«(n^h 



Conwrfed lo the GM ±cale 

orT« rfnth cr tilt rotti 
(tifan and Uindard dfVit- 

If 0 49C2 X + 33^380 



Table 19A: Scoring Formulas and Reliability Coefficients

Advanced French Test, Form A
(Description of Sample: Total group to highest multiple of 5;
N = 190.)

                                 Number   Scoring                Standard Error
                                 of       Formula   Reliability* of Measurement
Score                            Items                           Raw    Scaled
1. Interpretive Reading Skills     92     R - W/4     .926       4.89    2.41
2. Literature and Civilization    103     R - W/4     .916       5.09    2.53
3. Total Score                    195     R - W/4     .955**     7.06    [illegible]

*Adaptation of Kuder-Richardson formula (20).
**See text.

Table 19B: Intercorrelations

Score                              1       2       3      Mean     S.D.
1. Interpretive Reading Skills     -     .742    .935    49.61    17.97
2. Literature and Civilization   .742      -     .932    36.94    17.58
3. Total Score                   .935    .932      -     86.45    33.1?

Table 19C: Speededness of Test

Percent completing test                                     61.6
Percent completing 75 percent of test                      100.0
Number of items reached by 80 percent of the candidates      194
Total number of items                                        195



the committee of examiners determines whether or not the high
difficulty level is consistent with their knowledge of French
literature curriculum design and emphasis.

Table 19 presents three kinds of information: the estimated
reliability of the reported scores, the correlations among the
reported scores, and the speededness of the total test. The
reliability of each subscore was estimated by the Kuder-Richardson
formula (20) adapted by Dressel (1940) for use with formula
scoring:

$$r = \frac{n}{n-1}\left(1 - \frac{\sum pq + k^2\sum p'q' + 2k\sum pp'}{\sigma^2}\right)$$

where

$n$ = the number of items,
$p$ = the proportion of examinees answering correctly,
$q = 1 - p$,
$p', q'$ = the corresponding proportions for incorrect responses,
$k$ = the correction factor in the formula scoring,
$\sigma$ = the standard deviation of the score distribution,

and the summations are over n items. The corresponding formula for
the standard error of measurement, which translates the
reliability information into a statistic measured in score units,
is:

$$SE_{meas} = \sigma\sqrt{1 - \text{reliability}}$$
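
A sketch of the formula-score reliability and the standard error of measurement follows. The item-variance expression is derived here by scoring each item +1 if right, -k if wrong, and 0 if omitted; it is offered as an assumption about the form of the Dressel adaptation, not as a transcription of the printed formula. Function names are hypothetical.

```python
from math import sqrt


def kr20_formula_scored(p_right, p_wrong, k, sd_scores):
    """KR-20 with items scored +1 (right), -k (wrong), 0 (omitted)."""
    n = len(p_right)
    # Variance of one item score is pq + k^2 p'q' + 2k p p'.
    sum_item_var = sum(
        p * (1 - p) + k * k * pw * (1 - pw) + 2 * k * p * pw
        for p, pw in zip(p_right, p_wrong)
    )
    return (n / (n - 1)) * (1 - sum_item_var / sd_scores ** 2)


def standard_error(sd_scores, reliability):
    """Standard error of measurement in the same units as sd_scores."""
    return sd_scores * sqrt(1 - reliability)
```

Using the Table 19 values as read from this copy, `standard_error(17.97, 0.926)` is about 4.89 and `standard_error(17.58, 0.916)` about 5.10, in line with the raw-score standard errors of 4.89 and 5.09 reported for the two subscores. With k = 0 the first function reduces to the rights-only formula.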

The estimated reliability of the total score is computed by the
formula

$$\text{reliability} = 1 - \frac{\text{error variance}}{\text{total variance}}$$



where the error variance is the sum of the squared standard errors
of measurement of the two subscores, and the total variance is the
variance of the total score on the test. The total-score
reliability obtained by this formula is normally higher than that
obtained by applying the K-R (20) formula to the total test, but
the difference is small when the test is homogeneous. In the case
of the Advanced French Test (Table 19), the two estimates are
.9546 and .9540, respectively. If a score is a factor in a
decision that affects the future educational goals and career
choices of an individual, the estimated reliability of that score
should be high (and the corresponding standard error of
measurement low). The standard commonly used in the GRE Program is
that the reliability should be at least .90 for the total score on
any test. The subscores, which are not intended for selection
purposes, may have somewhat lower reliability estimates, with
lower limit set at about .60. For the example in Table 19, all the
reliability estimates exceed .90. The standard errors of
measurement are given in both raw-score and scaled-score units. To
translate the raw-score standard error of measurement into
scaled-score units, multiply by the conversion parameter A, which
is the slope of the conversion line. The use of the standard error
of measurement is explained in the Guide to the Use of the
Graduate Record Examinations, which is supplied to all
institutions using the GRE score reporting service.
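
The total-score computation can be illustrated with the subscore standard errors from Table 19. The total-score standard deviation of 33.1 and the conversion slope of 0.4930 are approximate readings from a partly illegible copy and should be treated as assumptions, as should the function names.

```python
from math import sqrt


def total_score_reliability(subscore_sems, sd_total):
    """1 - (sum of squared subscore standard errors) / (total variance)."""
    error_variance = sum(sem ** 2 for sem in subscore_sems)
    return 1 - error_variance / sd_total ** 2


def total_sem(subscore_sems):
    """Standard error of the total score from the subscore standard errors."""
    return sqrt(sum(sem ** 2 for sem in subscore_sems))


def scaled_sem(raw_sem, slope):
    """Raw-score standard error expressed in scaled-score units."""
    return raw_sem * slope
```

With standard errors of 4.89 and 5.09, the total standard error is about 7.06 and the reliability about .955, consistent with the values quoted for the total score; multiplying 4.89 by a slope of 0.4930 gives about 2.41 scaled-score points for the first subscore.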

Whenever a test generates more than one score for reporting, or
has separately timed sections, an intercorrelation table is
presented in the test analysis report. Table 19 shows such a table
for the Advanced French Test. The correlation of each subscore
with the total score is spuriously high because the subscore is
included in the total score and is a large part of it. The
correlation between the two






Table 20: Score Distributions 

Advanced French Test, Form A




subscores is not affected in this way because the subscores are
independent of each other. If two subtests are parallel, the
correlation between the scores will approach the geometric mean of
their reliability estimates. Expressed in terms of correction for
attenuation, $r_{ab}/\sqrt{r_{aa}\,r_{bb}}$ approaches 1. In this
expression, $r_{ab}$ is the correlation between the two subscores
and $r_{aa}$ and $r_{bb}$ are the respective reliability
estimates. For the example in Table 19, this value is about .81.
For a value less than .90 to .95, one may conclude that the two
subtests are tapping somewhat different traits or abilities.
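
The correction for attenuation is a one-line computation; the sketch below applies it to the Table 19 values (a correlation of .742 between the subscores and reliabilities of .926 and .916). The function name is hypothetical.

```python
from math import sqrt


def disattenuated_correlation(r_ab, r_aa, r_bb):
    """Correlation between two scores divided by the geometric mean
    of their reliability estimates."""
    return r_ab / sqrt(r_aa * r_bb)


value = disattenuated_correlation(0.742, 0.926, 0.916)
```

The result is approximately .81, the figure cited in the text.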

At the bottom of Table 19 are data relating to speededness. The
guideline at ETS is to regard a test as essentially unspeeded if
at least 80 percent of the group reach the last item and if
virtually everyone reaches at least three-quarters of the items.
Any conclusion based on this information must be supported by the
special score data presented in Table 20.
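
The speededness guideline can be written as a simple check. The text says "virtually everyone," so the 99 percent threshold used here is an assumption, as is the function name; the check covers only the two percentages and not the supporting evidence the text requires from the special score data.

```python
def essentially_unspeeded(pct_reaching_last_item, pct_reaching_three_quarters,
                          nearly_everyone=99.0):
    """Apply the two-part guideline to percentages from a test analysis."""
    return (pct_reaching_last_item >= 80.0
            and pct_reaching_three_quarters >= nearly_everyone)
```

A test on which 85 percent reach the last item and all reach three-quarters of the items would pass this check; one on which only about 62 percent complete the test would not pass on these two figures alone.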

Table 20 shows five distributions: the total formula-score
distribution and distributions of the number of items answered
correctly, answered wrong, omitted, and not reached. It also shows a
scatterplot, or two-way table, of the Score versus R + W. Two lines
appear on the scatterplot: a solid line that passes through equal
values of Score and R + W, and a dashed line near the bottom of
the plot that marks off the "chance" area. If speed is an important
element in the score, the "not reached" mean and standard deviation
are high relative to the corresponding statistics of the score
distribution, and there is a noticeable tendency for the entries in the
scatterplot to lie close to the diagonal solid line. If power is
important, the "not reached" statistics are relatively low, and the
entries in the scatterplot cluster in the right-hand columns. In the
example shown in Table 20, the evidence points to a measure of
power rather than speed.

The dashed line near the bottom of the scatterplot is used to
describe the efficiency of the test. If an answer sheet were marked
at random, the resulting score would be expected to approximate
zero, and the chances are 99 out of 100 that a score obtained in this
manner would lie below the dashed line. An efficient test would not
produce a large proportion of scores in this chance area. For the
example shown in Table 20, this proportion is less than .01.
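
One way to see where such a "chance" boundary could fall is sketched below. It assumes formula scoring of the form R - W/(k - 1) with k answer choices per item, that every item is answered, and a normal approximation to the score distribution; none of these details is stated in this passage, and the manual does not say how the dashed line was actually computed. The choice of five options per item is hypothetical.

```python
import math
from statistics import NormalDist

def chance_score_cutoff(n_items, n_choices, level=0.99):
    """Score below which a randomly marked answer sheet falls with
    probability `level`, by normal approximation (assumptions: all items
    answered, formula score R - W/(k - 1))."""
    # Per item the score is +1 with probability 1/k and -1/(k - 1)
    # otherwise, so the mean is 0 and the variance is 1/(k - 1).
    sd = math.sqrt(n_items / (n_choices - 1))
    return NormalDist(mu=0.0, sigma=sd).inv_cdf(level)

cutoff = chance_score_cutoff(n_items=195, n_choices=5)
print(round(cutoff, 1))
```

Under these assumptions the expected score from random marking is zero, which agrees with the statement in the text, and the 99 percent boundary sits a little over two standard deviations above it.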

Table 21 presents two sets of distributions, one set based on the
observed deltas of the reported scores and the other on the biserial
correlations of items with the total score. For a test of middle
difficulty for the group, the mean delta would be about 12.0. In the
example, the mean delta for the total test is slightly above this
value. The difficult questions seem to concentrate in the second
subscore. The deltas have been equated to put them on the
reference scale established in 1963 and defined by the reference
sample of seniors selected at that time. Delta equating is done so
that the equated deltas obtained with a current sample can be
compared directly with equated deltas from previous forms of the test.
The adjustment resulting from this equating is in this case very
slight.
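
For readers unfamiliar with the delta index, the sketch below uses the conventional definition, delta = 13 - 4z, where z is the normal deviate corresponding to the proportion answering correctly. That definition is assumed here and is not spelled out in this passage. The linear equating function is likewise only an illustration, and its slope and intercept arguments are hypothetical parameters.

```python
from statistics import NormalDist

def delta_from_proportion_correct(p):
    """Delta index (assumed definition): 13 - 4 * z, where z is the normal
    deviate for the proportion correct. Harder items get higher deltas."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return 13.0 - 4.0 * NormalDist().inv_cdf(p)

def equate_delta(observed_delta, slope, intercept):
    """Linear equating of an observed delta onto a reference scale."""
    return slope * observed_delta + intercept

print(round(delta_from_proportion_correct(0.60), 1))
```

Under this definition an item answered correctly by about 60 percent of the group has a delta close to 12, which is consistent with the "middle difficulty" figure quoted above.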

Normally, the criterion used in the analysis is the score on the set
of items analyzed. The criterion for the total test would be the total
score, and that for each subtest would be the subscore itself. In this
case, the test specialist requested that the total score be used
throughout. If the subscores had been used as criteria, the mean
r-biserial for each subtest analysis would have been higher. The
recommendation for the committee of examiners would be that it review
carefully the seven items with biserial correlations below .20. Such
low values are unusual in a language test and may indicate faulty
stems or items based on subject matter most students do not
encounter in their undergraduate programs. Although the item
distribution sheet shown as Table 22 is not normally included in the
test analysis report, it is used by the committee of examiners to
judge the appropriateness of an item that may have questionable
statistics. If, for example, an item has a low r-biserial and a low
delta, it produces a tally in the lower left-hand portion of the item
distribution sheet, thus indicating that the item is very easy for the
group of examinees sampled and does not discriminate well among
them. On the other hand, if the r-biserial is low and the delta is very
high, this is sufficient evidence of a problem that warrants further
investigation.
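
The review logic in the paragraph above can be sketched as a small classifier. Only the .20 cutoff for the r-biserial comes from the text; the delta cutoffs for "very easy" and "very hard," the function name, and the sample items are hypothetical.

```python
def review_flag(r_biserial, delta, low_r=0.20, easy_delta=9.0, hard_delta=16.0):
    """Classify an item for committee review. The delta cutoffs are
    assumptions of this sketch, not values from the manual."""
    if r_biserial >= low_r:
        return "no flag"
    if delta <= easy_delta:
        return "low r-biserial, very easy item"
    if delta >= hard_delta:
        return "low r-biserial, very hard item: investigate further"
    return "low r-biserial: review"

# Made-up items: (r-biserial, delta)
items = {"item 17": (0.12, 7.4), "item 58": (0.45, 12.9), "item 143": (0.08, 17.2)}
for name, (r_bis, delta) in items.items():
    print(name, "->", review_flag(r_bis, delta))
```

In practice the committee reads these statistics alongside the item text, so a flag of this kind is a starting point for review rather than a decision.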

In the course of examining the data on the most recent form of a
test to prepare for the construction of a new form, the committee of
examiners usually compares the analysis results of the five most
recent forms. A summary of the results for the total scores appears
as Table 23. A similar table for the subscores appears as Table 24.
Thus, the test analysis report is a compact but complete record of
the essential statistical characteristics of the test and serves as a
guide for future test construction.

Descriptive Statistics 

The primary reason for providing descriptive data is to help score
recipients interpret scores. The usefulness of a test score is



Table 22: Item Distribution Sheet

Advanced French Test, Form A

Columns of the sheet: raw delta, in intervals 5.9 down, 6.0 to 6.9, 7.0 to 7.9, and so on through 18.0 to 18.9 and 19.0 up.
Rows of the sheet: r_bis.

r_bis       Tallies, left to right                    Total
.80-.89     1                                             1
.70-.79     2, 1, 1                                       4
.60-.69     2, 3, 4, 4, 3 (one entry illegible)          17
.50-.59     1, 2, 3, 2, 5, 8, 11, 2, 6, 4                44
.40-.49     2, 5, 4, 8, 13, 7, 7, 4, 7                   57
.30-.39     1, 2, 6, 6, 8, 9, 7, 2, 1, 2                 44
.20-.29     1, 1, 1, 2, 3, 2, 6, 3, 1                    20
.10-.19     1, 1, 1, 1, 1                                 5
.00-.09     1                                             1
Neg.        1                                             1

Raw delta   5.9 down  6.0-6.9  7.0-7.9  8.0-8.9  9.0-9.9  10.0-10.9  11.0-11.9  12.0-12.9
Total       [?]       2        5        6        12       13         27         34

Raw delta   13.0-13.9  14.0-14.9  15.0-15.9  16.0-16.9  17.0-17.9  18.0-18.9  19.0 up  Total
Total       35         22         20         14         2          2          0        195

            n      Sum       Sum of squares   Mean      S.D.
r_bis       194    84.48     40.244[?]        0.4355    0.1335
Delta       195    2481.8    32727.56         12.7271   2.4196




Table 21: Frequency Distributions of Original Deltas and Biserial Correlations, by Score

Advanced French Test, Form A

Observed delta      Total     Reading Skills    Civilization
18.0-18.9           2                           2
17.0-17.9           2                           2
16.0-16.9           14        2                 12
15.0-15.9           20        4                 16
14.0-14.9           22        6                 16
13.0-13.9           35        12                23
12.0-12.9           [?]       [?]               [?]
11.0-11.9           27        16                11
10.0-10.9           13        10                [?]
9.0-9.9             12        6                 6
8.0-8.9             6         6
7.0-7.9             5         4                 1
6.9 down            3         2                 1
Total n             195       92                103
Mean (observed)     12.7      11.7              13.7
Mean (equated)      12.6      11.7              13.6
S.D. (observed)     2.4       2.2               2.2
S.D. (equated)      2.2       2.0               2.0
a                   0.92
b                   0.92

r_bis (Criterion = Total Score)
                    Total     Reading Skills    Civilization
.80-.89             1         1
.70-.79             4         4
.60-.69             17        12                5
.50-.59             44        21                23
.40-.49             57        28                29
.30-.39             44        14                30
.20-.29             20        9                 11
.10-.19             5         2                 3
.00-.09             1                           1
Negative            1                           1
Not Computed        1         1
Mean                .44       .47               .40
S.D.                .13       .14               .12


Table 23: Summary Statistics for Total Score

Advanced French Test

Form                        [?]             W               X               Y               A
Administration              Oct. 1972       Dec. 1972       Apr. 1973       [?]             Oct. 1976
Test Analysis Sample N      420             450             360             370             190

Raw Score Information
Number of Items             190             190             190             190             195
Maximum Obtained            [?]             154             161             [?]             [?]
Minimum Obtained            [?]             3               2               [?]             [?]
Mean                        [?]             78.35           69.47           [?]             [?]
S.D.                        [?]             24.68           [?]             [?]             [?]
Median                      [?]             79.75           67.83           [?]             [?]

Scaled Score Information
Mean                        540             534             523             517             519
S.D.                        83              78              91              91              88
Maximum Possible            860             880             900             890             810
Maximum Obtained            770             770             810             770             770
Minimum Obtained            300             290             310             320             310
Minimum Possible*           300             280             310             320             290
No. of 990 Scores
Current Norms
  Mean                      542             542             542             [?]             [?]
  S.D.                      92              [?]             [?]             88              [?]

Item Statistics
Mean P+                     51.8            51.0            45.7            42.5            52.7
Mean delta (observed)       [?]             12.9            13.4            13.8            12.7
S.D. delta (observed)       [?]             2.9             2.5             [?]             2.4
Mean delta (equated)        [?]             13.3            13.5            [?]             [?]
S.D. delta (equated)        [?]             2.4             2.3             [?]             2.2
Mean r_bis                  .38             .35             .39             [?]             .44
S.D. r_bis                  [?]             .14             .14             [?]             .13
r_bis < .18                 [?] items       17 items        9 items         9 items         5 items

Test Statistics
Reliability                 .940            .921            .942            .947            .955
SEM
  Raw Score                 7.00            6.94            7.01            7.00            7.06
  Scaled Score              20.41           21.90           22.00           20.91           18.79

Special Score Data**
Mean Rights                 97.26           95.06           86.28           82.56           101.07
Mean Wrongs                 63.24           63.35           67.83           66.67           58.95
Mean Omits                  26.17           27.80           29.96           37.74           33.52
Mean Not Reached            3.33            3.79            5.92            3.02            1.46

Speededness**
% completing test           79 94 85 45     86 41 78 69     54 81 37 51     63.8            61.6
% completing 75%            99 100 99 96    99 100 100 94   99 99 98 90     98.4            100.0
Item reached by 80%         54 55 40 40     55 55 39 37     52 55 40 35     189             194

*The scaled score equivalent to 0 is arbitrarily assigned to negative raw scores.
**Forms introduced in October and December of 1972 and in April of 1973 consist of four separately timed sections. The special score data are based on combined information; the speededness data are given by section.



enhanced when accompanied by relevant information that includes
a description of the test and normative and descriptive data that
permit evaluation of the performance of an examinee or group of
examinees relative to that of an appropriate norms group. The
descriptive data for each GRE test are provided in a descriptive
booklet made available to students before they take the test and to
graduate institutions that use GRE score reporting services. The
statistical information, which includes reliability estimates of
reported scores, standard errors of measurement, and intercorrelations
among reported scores, appears in the Guide to the Use of the
Graduate Record Examinations. Other kinds of interpretive data
based on the performance of groups of students are provided in
part in the Guide, supplemented by the descriptive statistics in the
following sections.

Basic Normative Data 

General, or aggregate, norms that provide a broad basis of
comparison for graduate institutions consist of percentile rank tables






Table 24: Summary Statistics for Subscores

Advanced French Test

IRS = Interpretive Reading Skills; LAC = Literature and Civilization

Form                      [?]                 W                   X                   Y                   A
Administration            Oct. 1972           Dec. 1972           Apr. 1973           Oct. 19[?]          Oct. 1976
Test Analysis Sample N    420                 450                 360                 370                 190
Subscore                  IRS       LAC       IRS       LAC       IRS       LAC       IRS       LAC       IRS       LAC

Raw Score Information
Number of Items           90        100       97        93        92        98        95        95        92        103
Maximum Obtained          83        80        86        68        80        87        80        76        90        92
Minimum Obtained          0         0         4         -1        -4        -2        [?]       -4        3         -1
Mean                      48.83     33.10     [?]       27.54     42.06     [?]       38.47     27.66     49.61     36.94
S.D.                      [?]       15.45     14.85     12.23     16.07     15.01     17.14     15.60     17.97     17.58
Median                    [?]       31.38     53.30     26.88     43.06     25.50     38.75     26.36     50.50     34.88
Skewness                  high neg. high pos. high neg. high pos. high neg. high pos. (-)       high pos. mod. neg. high pos.

Scaled Score Information
Mean                      54.6      53.0      53.4      53.4      52.3      52.3      51.5      51.8      52.0      51.9
S.D.                      8.1       9.0       7.8       7.8       9.1       9.1       8.8       9.3       8.9       8.7
Maximum Possible          76        92        77        95        81        95        81        92        73        86
Maximum Obtained          73        80        [?]       79        74        88        73        81        72        79
Minimum Obtained          29        34        28        36        28        36        32        35        29        34
Minimum Possible          29        34        26        36        28        36        32        36        28        34

Item Statistics
Mean P+                   63.9      [?]       63.1      38.0      55.5      36.5      49.9      35.2      62.9      43.4
Mean delta (observed)     [?]       13.9      11.7      [?]       12.4      14.4      13.0      14.5      11.7      13.7
S.D. delta (observed)     2.4       2.3       [?]       2.4       2.3       2.3       2.3       2.2       2.2       2.2
Mean delta (equated)      [?]       14.6      12.2      14.3      12.6      14.4      13.2      14.8      11.7      13.5
S.D. delta (equated)      2.4       2.3       2.3       2.0       2.1       2.1       2.4       2.3       2.0       2.0
Mean r_bis                .44       .40       .42       .35       .43       .38       [?]       .42       .47       .40
S.D. r_bis                [?]       .12       [?]       .11       [?]       .12       .13       .13       .14       .12
r_bis < .18 (very low)    4 items   3 items   6 items   7 items   2 items   3 items   3 items   4 items   2 items   4 items
Criterion Score           IRS       LAC       IRS       LAC       IRS       LAC       IRS       LAC       TOTAL*    TOTAL*

Subscore Statistics
Reliability               .901      [?]       .889      .842      .902      .894      .912      .905      .926      .916
SEM, raw score            4.85      5.05      4.96      4.85      5.03      4.88      5.09      4.81      4.89      5.09
SEM, scaled score         2.53      2.94      2.61      3.08      2.86      2.96      2.63      2.87      2.41      2.53

*The test development examiner requested this criterion. If the appropriate subscores had been used as criteria, the mean r_bis would have been higher.



Table 25: Frequency Distributions for All 1975-76
Examinees Who Intended to Major in Microbiology



based on the performance of all examinees within a recent three-year
norms period. From 1967 to 1977 this type of normative
information was the only type provided in the GRE Guide, and the
percentile ranks that appear with the scaled scores on the score
reports are taken from these tables.

The three-year norms have limited value for most graduate
institutions. For most users of GRE test scores, the need for identifying
an applicant's standing relative to an appropriate reference group
is better satisfied by developing local norms based on the
institution's own data. In an effort to satisfy this need and to encourage
institutions to accumulate local norms, the GRE Program now
supplies more detailed information for score interpretation in the form
of summary statistics reports based on score data of the most
recent reporting year. At the end of a reporting year, each institution
that received score reports during that year receives a summary
statistics report showing frequency distributions based on all
scores reported to the institution in that period, with a count of the
number of scores for each test and with the mean and standard
deviation for every distribution based on at least 25 cases. Each
department within the institution receives a similar report based on
scores reported to that department. A third set of distributions,
based on all scores reported to all institutions, presents
distributions for each intended major field group. A sample page from this
third set, presented as Table 25, shows the frequency distributions
for all 1975-76 examinees who intended to major in microbiology.
In addition to graduate institution summary statistics reports, the
GRE Program supplies the same kinds of information to
undergraduate institutions. Each undergraduate institution report is
based on the score information for examinees who identified that
institution as the one in which they did their undergraduate work.

Although it was known for some time that GRE examinees are not
restricted to seniors applying for admission to graduate schools,
there was a general assumption that the proportion of examinees
not included in this category was rather small. In 1970, Chaur C.
Chen analyzed examinees' responses to background information
questions for the Advanced Tests and found that the proportion of
Advanced Test examinees classified as seniors was lower than
expected, averaging about 60 to 70 percent. A similar study based on
data from the following year (1970-71) was completed by Frances
Swineford in 1971. Table 26 summarizes the examinee volume
analysis in the report of the latter study. Of particular interest are
the data for the Advanced Education, Engineering, Geography, and
Music Tests.

Table 26: 1970-71 Examinee Volume for the
Advanced Tests, by Educational Level

(Entries other than Number are percents of examinees at each educational level.)

Test                [?]     Jr. Level   Sr.     Bachelor's Grad.   1st Yr. Grad. Level   2nd Yr. Level   Number
Anthropology        1       [?]         [?]     [?]                6                     [?]             [?]
Biology             1       3           [?]     [?]                [?]                   7               13,496
Chemistry           2       3           [?]     [?]                7                     7               5,126
Economics           [?]     1           61      [?]                9                     6               4,770
Education           [?]     1           [?]     [?]                28                    14              24,179
Engineering         [?]     3           [?]     [?]                11                    6               7,858
French              1       3           65      20                 7                     4               2,472
Geography           1       2           53      21                 12                    10              962
Geology             1       3           64      15                 9                     8               1,636
German              1       3           64      [?]                7                     6               702
History             1       [?]         62      21                 [?]                   4               10,637
Literature          1       2           60      22                 10                    [?]             14,079
Mathematics         [?]     [?]         64      16                 9                     6               7,131
Music               [?]     [?]         [?]     [?]                12                    3               2,503
Philosophy          1       [?]         65      18                 8                     5               1,570
Physics             1       3           67      12                 9                     7               3,907
Political Science   1       3           62      21                 8                     5               [?]
Psychology          [?]     4           65      [?]                7                     [?]             17,578
Sociology           1       3           67      [?]                6                     4               6,485
Spanish             [?]     3           60      20                 10                    6               1,739

In light of this information, questions were raised about the need
to provide more clearly defined norms groups. The first percentile
rank tables based on a normative group restricted to seniors plus
nonenrolled college graduates who had no graduate school
experience and who took the tests in 1974-75 were published as a
supplement to the 1975-77 GRE Guide. The 1977-78 edition of the
GRE Guide was the first to include this kind of information. These
norms tables are probably more appropriate than the three-year
aggregate norms for evaluating the performance of applicants for
admission to graduate schools.

Descriptive Statistics for the Aptitude Test

In the preceding discussion of the development of the GRE
scaled-score system, the fact that students selecting the various major
fields show, on the average, different levels of developed abilities
was well recognized, as evidenced by its incorporation into the GRE
scaled-score system. It is important that the magnitude of these
differences be made known to score recipients who use GRE Aptitude
Test scores as part of applicant information for making selection
decisions. The descriptive data summarized in Table 27 are based
on the same group described in the preceding paragraphs: seniors
plus nonenrolled college graduates tested between 1974 and 1976.
The table is based on examinee response to the background
information question on undergraduate major field and shows
Aptitude Test performance for each major field category. The major
fields are grouped into the four main categories: humanities, social
sciences, biological sciences, and physical sciences. Each of these
is further divided into subgroups based on a logical structure:
language versus nonlanguage majors in the humanities; education,
history, business-commerce-communications in the social
sciences; applied versus basic science in the biological sciences;
and mathematics versus physical sciences in the physical sciences.
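
The summary statistics reports and local norms discussed in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the sample scores are made up, the sample standard deviation is used because the manual does not say which form it reports, and the percentile-rank convention (percent scoring below) is one common choice rather than the GRE Program's documented definition.

```python
from statistics import mean, stdev

def summary_statistics(scores, min_cases=25):
    """Count of scores, plus mean and standard deviation only when the
    distribution has at least `min_cases` scores."""
    report = {"n": len(scores), "mean": None, "sd": None}
    if len(scores) >= min_cases:
        report["mean"] = mean(scores)
        report["sd"] = stdev(scores)  # sample S.D.; an assumption here
    return report

def percentile_rank(score, norm_scores):
    """Percent of the local norms group scoring below `score`."""
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)

local_scores = [400 + 10 * i for i in range(30)]  # made-up scaled scores
print(summary_statistics(local_scores)["n"], percentile_rank(550, local_scores))
```

An institution accumulating its own score data over several years could apply the same two steps to each department or intended major field.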



Table 27: Aptitude Test Performance of Seniors
and Nonenrolled College Graduates Classified
by Undergraduate Major Field

(Tested at National Administrations between
October 1, 1974, and June 30, 1976)



HUMANITIES





\ I 




lUilitt 


QiiMrjt*i4Ti A^lllr 


















3..b«.A,P * 






0*ii«li9ft 




0*«Jtli*n 




linguistics 


'■ ^)?? i 




l?6 




m 


19 


()TrieT rore'Rn LjnSu»ltF\ 


1 1 




I2h 


451 




?o 


Ci^ssicji LififttiAffi 


G3e ! 


63S 


[07 


5iG 


HQ 


.'^ J 


t rT|liVh 






11? 


499 




6/ S 


f rf nch 


: }l\ ; 


i60 


10^ 


bOS 


IDS 


9 B 




1.(5^1 1 




][l6 


b33 


116 


J B 






GO If 


HE 




U& 


\ } 




. ? 39^ , 


^13 


120 


4C9 


\V 


fl6 


fit [ iMcrn 1 #n^u»k(l 


m 1 




1 IB 




\n 


\ 1 




, l?4 


see 


\3Z 






04 


lEdli^r: 


103 E 


yj3 


\23 


474 


107 






f 












Archffiioj^ 






tl8 


5J9 


117 


09 


ArcniiftiMi^ 






HG 


59? 


104 


6& 


hue Ans 






IJ' 


471 


1U 


S 


Mu\iC 


' ^Hlb 1 




114 


bOb 


118 


17 1 




. 3.304 ! 




tl? 




i:: 


9 i 








n; 


bl4 


m 


9 ; 








99 


461 


107 


114 




?.ioo I 


S69 


Uf/ 


501 


111 


U ." 


C ruti 0i r a1 iv^" I IT*? Tar nfr- 






It/ 


Ml 


119 


; 4 


[>r(ifTiaric Ar^^ 








inf 


118 


G 


nu'i" H(,rTH(HTi(;^ 


1 6.001 , 


491 


i:jq 


4G0 


I2b 


17 & 


1 Si.tifMiiiU A 


?! ?,M 






■lO? 


116 


^.t 




' J* ■. 


My 


i.*rr 


i9<3 


III 


-\ I 


MljMANHlFS 


[ 61 9JG j 




1/] 




m 




















Table 27



SOCIAL SCIENCES



T 


■ 
























4 


















n 












EtftftBlionil Adniini^tTilion | 






LJ4 


417 


n; 


01 


Educitton ! 




45Ji 


103 


4&0 


113 


7j 1 






470 


9J 


460 


ni 


4S 


GuiOtnct tftd Counitlmt 




4')/ 




434 


119 


04 






in 


] IS 


497 


IIB 


0 1 


Socitl PiTCholotV 






MB 


"109 


171 


0 / 










S73 


IL6 


4/ 6 








IDS 


433 


114 


4 7 




ll.?flC 


499 


ns *69 


177 


14 3 


>iiMr>»Ti Studit^ 


L.OJl 




]t)B 


Slfi 


114 


30 








9* 


s«a 


171 


04 








lOH 




111 


5 J 








103 


57S 


in 


116 




13 its 


^9 


lie 




1?1 


404 




1 




110 


S3 4 


llfr 


37 


Uw 


394 




n9 


456 


134 


1 1 








1 13 


506 


171 


34 1 








loe 


49^ 


! 13 


S < 


Buii04ii tnd ComfTrfiC-^ 




4'.S 


10/ 


!)3] 


173 


L6 8 


CommunhCttiofii 




^(^] 


ll]9 


4S4 


116 


S 6 






S7; 


173 




119 


167 


|ntfui(ri4F Atlvr.on^ 




49? 


in* 




116 


09 


Librtnr Sc>tnc4 




4H^' 


111 


447 


10/ 


1 4 


Pub^K A<ln>iniilrtliQft 


6ie 


\n 


111 


4flS 


IIS 


1 S 






SI/ 


111 


S3fl 


174 


7 3 


Othir SoCidl Sclt^:t^ 


?0 tM 




\ii 


4S1 


176 


4914 


Tout Subfrdup ' 






u: 


491 




" V[ 4 


lott\ ^uDlrdUP 0 ' 


14 35A 




] IS 


510 


i?i 


77 3 


Tdtdl SublrouP C 1 


404ri 




H9 


4S6 


135 


. 76 ) 


social SClENCtS 








49<l 


""" |iV" 


J 43t> 


BIOLOGICAL SCIENCES














iu>H r)«it 










~i 


















n 




b*fiirl4n 








PhtrmKdldiy 




4J»^ 




SI9 


144 


^ 0 J 










i/6 


ly? 


2A 






444 


St 


4/G 


177 


1 J 


Mttdrnv 




'A(\ 


] '4 


S6/ 


us 


0 3 




713 


4b'H 


<fa 


S3S 


113 


I 4 












n*^ 


4 4 








9(i 


4;R 


101 


47 7 


OccuPtliOntt Th^i^tpt 




'jI? 


91 


SOS 


loe 


1 5 




19 




146 


^9,1 


176 


0 L 




1/ 




91 


^3(l 


n 


01 






4^K 


n\ 


^t»() 


93 


46 




46*] 


199 


94 


S.lfJ 


100 


3 1 


Phyjidlo^ 


43; 


SJ4 


[06 


607 


108 


?9 




in 


4^6 


1/ 


490 


171 


78 


Vtl«rtntry MtdiCin^ 




49 r 


96 




101 


3S 




I^^7 


447 




468 


IK 


1 0 




1,4(S 




his 


',?7 


iin 


94 






4bl 


9fr 


46B 


JOG 


L7 1 




7fi3 


•j;o 


107 


MH 


97 


06 


Micno^EdlDfy 


J 904 






SflH 


9/ 


4 1 


Ptrtutalofy 


37 


4'.1 


ni 


466 


11*1 


0 1 




\OMJ 


4H] 


111 


S7Ji 


. 171 


73 7 






4',? 


\riJ 


S4] 


106 


S S 




i7e 




IOh* 


S/4 


IC; 


0 ^ 


Bi«hffniuJ> 


1.609 




lOK 


6SS 


98 


?s 




:? 


v7^{ 


lOb 






4r « 


BiDf)hy'»ir'+ 


IM 




104 


6/6 


107 


0 3 




i GSl 


Ml 


IftS 


^9] 


107 


71 




750 


VM 


;[>9 


%/n 




06 






'/a 


96 


,99 




7 1 




i,3?9 


'.if 


99 


'i^U 


it 


9 3 


Tciuc ^iinarrjiip A 


V; ijm 


Mi' I 






u : 


•i 1 


Tmif ^ub^roup B 




'■■\'> 






n I 


/^ } 


■ ■ 1- 

BlOlOCrC^l SCIENCES [ 


!il.C&S 






S^)4 




\^ : 



(cont.)



PHYSICAL SCIENCES



























tirtHIt 




H 






■tin 






ApPlltCt MAfhtmtthCl 


5/6 


579 


174 


6B3 


96 


51 




709 


4S9 


1 1Q 


664 


102 


LB 


Mtlhtmtnct 


6r591 


575 


'7 J 


671 


lOL 


75 7 


CciiTtpurti SC4F>Ct 


1.9/5 


5)0 




673 


99 


17 4 




- .. 

4r077 


449 


171 


S94 


113 


Lt 3 


AVNOoOFlly 


7 It 


60 1 


104 


683 


lOL 


06 




ra9i 


*)?9 




647 


100 


22 2 


f njrnttnng. A^Tontuticil 


534 


51^ 


105 


677 


65 


1 5 




I.?l9 


487 


130 


678 


91 


4 6 


Ln|iitc«rin|. Civil 


7.7n 


47L 


IL5 


663 


93 


76 


En|ir>ftnnfH ElHlrtttI 


4.360 


461 


J33 


674 


35 


12 ) 


f OgiHtliOgh Intfujtrdh 


596 


447 


125 


636 


104 


1 7 


Enlin«erm|. Mechtmctl 


7.J49 


467 


I7S 


666 


33 


66 


E rigiof 4f lOt, Orhtr 


2X177 


503 


113 


672 


97 


59 


Gnlatr 


i.B»9 


537 


105 


60« 


too 


no 


Mtldllur^ 


700 


471 


1)4 


664 


35 


06 


Mining 


)6 


430 


L40 


600 


iLS 


OL 




366 


499 


101 


679 


107 


1.0 




4,440 


559 


L23 


694 


92 


12 5 


fdtti Subgroup A 


LU53 


5?6 


121 


672 


LOO 


744 


Tolil SubtrouP B 


)5.486 


50) 


L2S 


6^^2 


104 


76 6 


PHYSICAL SCIENCES 


46.B41 


509 


L25 


6S6 


103 


1)1 


OTHER


UrtMl^lf^Hit* 








Qwnttbthp* AWIktir 
















































n 




bnllHMi 


14441 






Olhtr 


11.73) 


440 


1?S 


4/1 


L3S 


338 


No RttpOn» 


7L.96G 


491 


12B 


510 


133 


66 7 


Jfiii\ Subgroup 


33,199 


4/4 


1?9 


497 


139 


93 


All Stnior^ inO Nuntn 














iDliftf Cd«tRt Oratfuatt^ 


35/.570 


soe 


170 


S78 


13) 


1000 



Table 28 provides the same kinds of information based on
responses to the background information question on intended
graduate major field. The purpose of both tables is to point out
the magnitude of the range of means for both verbal and
quantitative ability scores and the relationships between Aptitude Test
performance and educational experience and educational goals.

Because normative data on Aptitude Test performance by
intended graduate major field are important in score interpretation
for the purpose of selection, grouped score distributions by
intended major field are now included in the GRE Guide. The score
intervals used in these distributions are rather large, but the main
statistical properties of the distributions can be observed.

When this technical manual went to press, the only normative
information available for the new analytical measure was based on
the data obtained in the first administration of the restructured
Aptitude Test (October 1977). Although the examinee group of that
administration is high scoring and includes a relatively high
proportion of fellowship applicants, the relationships among the verbal,
quantitative, and analytical scores for the four undergraduate major
field categories may provide a useful guide for interpreting scores
on the new measure. A brief summary of this information is given in
Table 29.

Other Factors Interacting with Aptitude Test Performance

Other factors related to Aptitude Test performance have been
examined for this same norms group. Three of these are sum-







Table 28: Aptitude Test Performance of Seniors
and Nonenrolled College Graduates Classified
by Intended Graduate Major Field

(Tested at National Administrations between
October 1, 1974, and June 30, 1976)



HUMANITIES







1 Vtrbl 














■hi*' Ft*i4 




























t 


tl4ar4 1 


taiufll 41 






1 












■ 










■bH D 




* f 








1 bae 


133 






553 


lit 


T 65 


Olhff Fon-in l.«nlLj«S«^ 


333 


! 4^3 


L36 






ua 






? ? 


Clm»c«l l.0r,iu«c«^ 


380 


; 647 


105 






560 






26 








109 






505 


135 


1 


62/ 




1.210 


1 b52 


109 






493 


106 




82 






1 b67 


10/ 






5? 5 


IL2 


1 


1 6 




3ia 


' S02 


116 






547 


118 




2 1 




' 1 206 


' 493 


\2l 






4^0 


1 1 7 




SI 




i 336 




104 






MJ 


L15 


; 


23 






1 602 


L14 






54S 


177 




1 2 


t1«li«n 


66 


M2 


m 






445 


100 




0 4 




569 


m 515 


110 


■ 






3 0«B 


S12 


11^ 






^MM) 


lOt 




9 3 


FifK Arts 




1 494 


M4 






46 b 


114 


* 


9* 


. Mil IK 


*.923 


■ bis 


M5 






507 


119 




14 9 


^■I0t4|^^> 


2.0(ft 


, 603 


109 






567 


1 2? 


^ 


6 1 




b)05 


5?; 


113 






524 




1 


15 8 




3.2 JO 


, 


IOC 






457 


] 0 ; 




93 


ftft Hiitonf 


i sa9 




10/ 






490 






5 5 






L 6]f: 


111 






^iTl 


1 i ^ 

L 1 1 




2 4 


OrtirahC Arti 


. ?.2ti 


' 


Ml 






494 


120 




63 


Other Hunitnrri*^ 


S.69? 


; 49/ 


127 




46; 


;24 


-A 


1 7 3 


Tot«l SubSrOul^ A 


; U.33? 




117 






TTa 




31 1 




i 32,WS 


4 


119 






504 


1?4 




6B9 


HUMANrTlES 


I 






S04 


122 




13 4 




SOCIAL SCIENCES












































1 




-- 






N 














m 








1 4S& 


107 






476 


121 




34 




2r.4?0 


4/| 


][); 






4;? 


3 16 




29 4 


Phri<t\ CtfiiCftiop' 


? 6se 


i 4ZS 


45 






4/0 


1 




J * 




6 968 


4/U 


1U7 






466 


1 16 




9 1 


Educ^hanil Pi]rchoiQft 


2.4*/ 


so; 


10^ 






>uo 


1 U 




3 2 


Sofifl P^Tt^oioSr 


1.091 


i 


il? 






506 


120 








:b 42S 




lOB 






525 


118 




34 7 




/sso 


485 


Ml 






4^!a 


U/ 




10 3 






!?! 






47S 


128 




49 


A(r«riC«n S<udti;\ "i^fJ i>/& 


to/ 






514 


115 




— ' — - 
2 7 


Slav^j: Sludie^ 


is; 


507 








556 


113 




09 




I.3M 


S14 


no 






bii 


113 




6 1 


AmhrapDioCf 


2;33 


s/r 


105 






519 


M 5 




12) 




67 W 




lis 






502 


123 




3J 5 


inrer national Re^^Mons 


2.310 




U4 






524 


121 




13 6 




^.210 


S7f) 


I2J 






521 


135 




10 3 


Cnv«rn(n4Tpl 


4 /;0 


S26 


I/O 




r 


503 


1^6 




22 2 




2.m 




103 






49/ 


116 




52 


Bui mm tn<] Comirerce 




4ia 


Ul 






556 


122 




12& 




2.0 10 


' SIO 


112 






m 


12(5 




^4 


FCOfWmici 


4.74/ 


I S26 


m 






60/ 


120 




85 


induirrpii fteiihcni 


L2LU 










516 


ns 




? 2 


; library Schme 


6.63? 


bsl 


112 






431 


112 




17 0 


PubrK Atfrni^ti^trilton 


b.M3 


1 495 


no 






493 


121 




10 0 


Urhan Dtvilapment 


3,69J 


1 


111 






535 


124 




66 


orner Soc<*f ScieriCfs 


70. /a 1 




117 




r 


458 


12s 




373 


Total Subtr«up 


7S.L60 


[ 496 


113 






490 


121 




49 J 


TotiF SubCiOup B 


21 4S3 


' 54S 


Mti 






5U 


U* 




14 0 






170 




i 


500 


130 




363 


SOCIAL SCtENCE'J I 1^37! 


r -^^ 


11/ 




1 


49' 


125 




42 9 






BIOLOGICAL SCIENCES







Vai%4l 




Qiupriltvtia* 


AMHt) 




























... 












1 




tarcanr ** 


Ivttw*^ a 


** 


man 


Ot*lab«fl 






.'^'!!!?.. 




1,01)2 


52 > 


109 


609 


102 


19 


Au<tpalDfy 


1.06/ 


4BB 


95 


483 


104 


ii 


H<i^Pirai Adf^ilPiTraLion 


2.193 


485 


104 


517 


U9 


7^1 




633 


50B 


98 


556 


105 


23 


Dennslry 


474 


47/ 


101 


560 


1)7 


1 J 


Med>cin« 


2J5Z 


544 


106 


609 


110 


} } 


Nursing 


6,130 


507 


97 


481 


lOi 


22 0 


OccuPaTioni' iheriPy 


iJ9 


501 


ion 


493 


110 


19 


Oplirnetr^ 


100 


497 


95 


5B5 


107 


04 




SB 


527 




55S 


101 


03 


Pharmacy 


56« 


462 




571 


107 


20 


Ph>iical Therap* 


i.n^ 


481 


98 


326 


103 


6 1 


PhyjwlogTf 


2.0?B 


529 


104 


591 


104 


7 4 


Public Heatlti 


2h(21 


512 


112 


529 


120 


94 




2h4)6 


523 


100 


5BJ 


98 


89 


Paifto4otr 


704 


490 


106 


543 


119 


25 


Nutnhon 




407 


lOB 


526 


U2 


6 4 


Hom^ Economics 


J. 498 


444 


98 


463 


10^ 


5 4 




l.lBl 


5&7 


101 


616 


97 


3 4 


Microbiology 


3.379 


513 


105 


574 


106 


9a 


Psraiilolo^ 


150 


510 


114 


544 


109 


04 


Olfter BiOIOjical StienCej 


13.B19 


500 


112 


549 


121 


400 


Agnctjtliire 


1,993 


455 


112 


541 


109 


5a 


BKtcrwNnff 


443 


493 


102 


54B 


110 


1 3 


BiOc hcin Fi-rn^ 


3.027 


547 


109 


640 


101 


B B 


Bioiosy 


4.509 


52 J 


117 


574 


114 


13 0 


BiO{>hriic^ 


337 


565 


115 


664 


lot 


10 


841 inr 


L379 


551 


103 


588 


101 


40 


EnToniolo£¥ 


5?3 


511 


106 


565 


109 


1? 


FoiBtrr 


1.04i 


509 


97 


532 


LOl 


30 


?{)Olosr 


2J72 


545 


9« 


531 


38 


60 


lolal Subfroup M 


27.a9? 


504 


105 


533 


US 


44 6 


Total Subgroup 0 


34.5S3 


514 


lU 


572 


U6 


55 4 


BIOLOGICAL sciences 


62.4eO 


509 


109 


554 


llB 


17.5 


PHYSICAL SCIENCES

Intended Graduate                         Verbal Ability     Quantitative Ability
Major Field                     N         Mean     SD        Mean     SD        Percent

Applied Mathematics             n3        537      118       682      100       85
Statistics                      616       514      118       B79      101       71
Mathematics                     3,394     527      130       676      105       39.2
Computer Science                3,922     523      128       663      100       45.3
Other Physical Sciences         4,258     460      124       602      121       13.9
[field name missing]            423       595      106       683      95        1.4
Chemistry                       4.; 65    524      118       650      100       1^6
Engineering, Aeronautical       460       507      no        672      91        1.5
Engineering, Chemical           1,604     479      13C       672      96        52
Engineering, Civil              2,479     467      U6        652      99        a 1
Engineering, Electrical         3,684     482      133       676      95        120
Engineering, Industrial         301       437      126       636      103       2.9
Engineering, Mechanical         1,>37     461      128       666      93        5.7
Engineering, Other              2,396     505      U5        667      98        7a
Geology                         1615      530      104       €04      t02       iia
Metallurgy                      220       479      136       885      95        07
Mining                          45        493      112       615      in        01
Oceanography                    1,040     521      104       620      100       34
Physics                         3,011     562      125       702      87        98
Total Subgroup A                8,665     625      12?       674      102       22.0
Total Subgroup B                30,63«    500      ^26       649      106       78.0
PHYSICAL SCIENCES               33,303    506      126       655      105       11.0


OTHER

Intended Graduate                         Verbal Ability     Quantitative Ability
Major Field                     N         Mean     SD        Mean     SD        Percent

Other                           10,526    449      123       470      ]23       19.3
Undecided                       10,487    W1       123       S37      132       19.2
No Response                     33,621    500      127       521      137       61 S
Total Subgroup                  ^1534     494      12t!      514      137       15.3

All Seniors and Nonenrolled
College Graduates               357,570   SOS      120       528      133       1000



r '1 



Table 29: Summary Statistics for the Aptitude Test
Performance of Seniors and Nonenrolled
College Graduates Tested in October 1977,
Classified by Undergraduate Major Field





-I 








Jk»h4c«t J^rtv 




n 




Haul (h*t«nH 




Htiiunrtiti 










lis 




IN 
















\70 










»3 


107 




110 






W7 


121 


G73 


in? 




114 


TOTJU. CROUP* 




5?3 






127 




— 

119 




1 3 •±«m.n««« -m 













Table 31: Aptitude Test Performance of Seniors
and Nonenrolled College Graduates, Classified
by Citizenship and by Primary Language







Vtftoi Afeiiit) 








H 






T*t*l |]r«i,r 


ClTlJtNSHtP ' ' 














Atneiiuri 


3ie.&93 




US 


52» 








2L5» 


400 






U2 


6Q2 


No Re^QnK 




4« 


}27 


bl& 


13a 


*Hl 


PRIMARY ONGUJ^t 


















Siri 


116 


5?3 


132 




Othvr 


7i.esi 


W 


]» 




U2 


e.iz 


Nd ftetpODH 




*^ 


127 


S14 


t3« 


5.9Z 


TOTAL GROUP 






120 


&2J 




100 



Table 30: Aptitude Test Performance of Seniors
and Nonenrolled College Graduates, Classified
by Graduate Degree Objective





1 

] 

■1— 


■ 




Awilft 








-J 


Hurt DnIfrtH 


f*t*i Wmv 


Nomltf rH study 








l» 


^11 


142 


I J3 




1 


196 923 




114 




129 


SS07 


(MA. M$, M E4 trc > 




















?*9G 




lU 




\27 


? la 


{^tt 4) S(>eci4iiti) 




















111.333 


M4 


lit 




130 


31 1* 


tPtt 0. Cd n . tic ) 
















pDStttocioitF ^jtudr 




9.721 


; se9 


114 


S93 


\2i 


2 n 


Ho rB3lMn» 


4- 




I 


13) 




140 


7 63 


TOTAL GROUP 


1 


3^7.!i;0 


T W8 


120 




133 


100 



marized in Tables 30 and 31: graduate degree objectives,
citizenship, and primary language. These tables are less useful than
those described in the preceding paragraphs because they
consider each factor independently of the others, even though
there is likely to be a complex interaction among all factors. Of
interest is the proportion of examinees in certain categories. For
example, more than 6 percent are not American citizens, and more
than 6 percent have indicated that they communicate best in a
language other than English.

Additional information describing the GRE population tested during
the academic year 1975-76 is provided in a GRE report by
Robert A. Altman and Paul W. Holland, A Summary of Data
Collected from Graduate Record Examinations Test-Takers During
1975-76, Data Summary Report #1, March 1977, Educational
Testing Service.



References 



Chen, C. C. Graduate Record Examinations Advanced Tests: Summary
of responses to background information questions,
October 1969 to July 1970 administrations (Statistical Report
SR-70-99). Princeton, N.J.: Educational Testing Service, 1970.

Dressel, P. L. Some remarks on the Kuder-Richardson reliability
coefficient. Psychometrika, 1940, 5, 305-310.

Kuder, G. F., & Richardson, M. W. The theory of the estimation of
test reliability. Psychometrika, 1937, 2, 151-160.

Levine, R. Equating the score scales of alternate forms
administered to samples of different abilities (Research Bulletin
RB-55-23). Princeton, N.J.: Educational Testing Service, 1955.
(Submitted as a doctoral thesis, Syracuse University, 1955.)

Schultz, M. K., & Angoff, W. H. The development of new scales for
the Aptitude and Advanced Tests of the Graduate Record
Examinations. The Journal of Educational Psychology, 1956, 47,
285-294.

Swineford, F. Graduate Record Examinations Advanced Tests:
Summary of responses to background information questions,
1970-71 administrations (Statistical Report SR-7M08).
Princeton, N.J.: Educational Testing Service, 1974.

Swineford, F. An assessment of the Kuder-Richardson formula (20)
reliability estimate for moderately speeded tests. Unpublished
report, Educational Testing Service, 1973.

Wallmark, M. A rescaling study of the Graduate Record
Examinations Advanced Tests (Statistical Report SR-69-4).
Princeton, N.J.: Educational Testing Service, 1969.






Chapter 6 

VALIDITY OF THE GRADUATE RECORD EXAMINATIONS 



Of the various characteristics of a test, its validity is often the focus
of greatest interest. Yet the term is ambiguous; it can be interpreted
in a variety of ways. Defined simply, validity is the degree to which a
test reflects the truth about the characteristics of the person whose
traits it purports to measure. Thus, validity is as much concerned
with what a test claims to measure as with the means by which it
measures.

The GRE Aptitude Test is defined as a measure of developed
verbal, quantitative, and analytical abilities. These abilities are
scholastic in nature and broadly applicable. They are assumed to
be the product of the interaction of personal characteristics and
experience and to be related to achievement in activities requiring
those skills. The Advanced Tests are designed to measure an
individual's mastery of a given discipline, which may be defined rather
specifically in terms of the typical undergraduate curriculum or the
usual expectations of graduate students in a field. What a test does
not measure may also be usefully identified in assessing validity.
For example, the Aptitude Test cannot be said to be valid for
measuring creativity or "raw" intelligence because it is not
designed to measure these traits.

It could rightly be said that the GRE Advanced History Test, which
covers American and European history, is not a valid measure of a
student's knowledge of ancient Chinese history. Even if all students
who do well on the Advanced History Test also score high on a test
of ancient Chinese history, the Advanced History Test does not
purport to measure that particular domain of knowledge and cannot be
valid for that purpose. Validity must be judged in the context of the
purposes of a test.

The ways of assessing validity are generally expressed as
different kinds of validity. Because these kinds of validity are really
ways of articulating questions about how well a test measures up to
its claims rather than clearly defined aspects of a concept, they
cannot be viewed as distinct and unrelated. For example, "content"
and "construct" validity, though differentiated as terms, may not
always be extricable in situations in which validity is being explored.

The kinds of validity most frequently referred to are content
validity, construct validity, and criterion-related validity. The
definitions of these terms quoted in the following paragraphs are taken
from Standards for Educational and Psychological Tests (American
Psychological Association, 1974).

"Evidence of content validity is required when the test user 
wishes to estimate how an individual performs in the universe of
situations the test is intended to represent. Content validity is most
commonly evaluated for tests of skill or knowledge; it may also be
appropriate to inquire into the content validity of personality
inventories, behavior checklists, or measures of various aptitudes" (p.
28). Content validity has special relevance to the Advanced Tests
since these examinations must represent subject fields accurately
and produce appraisals of knowledge that are fair regardless of the
fact that undergraduate curriculums vary from institution to
institution. However, the question of content validity is also appropriate to
an evaluation of the Aptitude Test.

nations Program (Willingham, 1976) and Predicting Success in Graduate Education
(Willingham, 1974)

Construct validity concerns the relevance and legitimacy of the
skill domains being tested. "Evidence of construct validity is not
found in a single study; rather, judgments of construct validity are
based upon an accumulation of research results. In obtaining the
information needed to establish construct validity, the investigator
begins by formulating hypotheses about the characteristics of
those who have high scores on the test in contrast to those who
have low scores. Taken together, such hypotheses form at least a
tentative theory about the nature of the construct the test is
believed to be measuring" (p. 30). In considerable part, the
construct validity of the GRE rests upon decades of psychometric
research, indicating the sorts of ability that play a critical role in
most types of intellectual work, and upon even more extensive
educational experience, indicating that frequently the best
predictor of future success in an academic field is early competence
revealed by a subject-matter test. Construct validation requires
constant attention, however, to ensure that a test is actually
measuring the construct intended. For example, a reading
comprehension test should not be so complicated in content as to stress
reasoning instead of reading, and a mathematics test should not use
language that places a premium upon knowledge of vocabulary.

"Criterion-related validities apply when one wishes to infer from a
test score an individual's most probable standing on some other
variable called a criterion. Statements of predictive validity [for
example] indicate the extent to which an individual's future level on
the criterion can be predicted from a knowledge of prior test
performance. For many test uses, such as for selection decisions . . .
predictive validity provides the appropriate model for evaluating the
use of a test or test battery" (p. 26). Predictive validity is particularly
important to the GRE Program because the examinations are used
to select students likely to succeed in graduate study, but validity
based on other criteria (such as self-reported undergraduate
grades) has also been explored.



Content Validity 

Concern for the content validity of the Aptitude Test is reflected in
test specifications based on: 1) diversity of topics and points tested;
2) coverage of fundamental concepts and skills; and 3) use of a
variety of methods of testing skills, for example, antonyms,
analogies, sentence completions, and reading comprehension in
the verbal measure and computation problems, data interpretation,
and quantitative reasoning questions in the quantitative measure.
The question of content validity is fundamentally whether the test
adequately samples the domains of verbal, quantitative, and
analytical skills.

The first and most important step taken to help assure the
content validity of the Advanced Tests is the direct involvement of
scholars and teachers in the discipline of each test in writing,
reviewing, revising, selecting, and approving questions for that test.
College professors who are actively engaged in teaching at
recognized institutions and who are therefore believed to be
familiar with the content of typical undergraduate curriculums and
the requirements of graduate study in their disciplines serve as
members of the committees of examiners.

Several additional steps are taken to aid the examiners in striving
for content validity. Probably the most important additional step is
the systematic and continual feedback to the examiners of
performance data of examinees on test questions. In addition, questions
about the educational backgrounds and goals of examinees are
periodically included in the test book. Examinees respond to these
questions just prior to taking the tests. The responses to the
questions are analyzed to show the test performance of student groups
with different backgrounds and goals, but the responses have no
influence on reported scores. Occasionally, a more extensive
questionnaire on student backgrounds and reactions to the test is given
to samples of examinees at a test administration. Responses to the
questionnaire are mailed back by the examinees following the
administration.

From time to time, a representative panel of college professors
reviews a test's specifications and actual test copies in
considerable detail. Some tests are routinely reviewed before printing by
professors not on the committee of examiners. Inspection copies of
tests may be requested by college presidents, deans, or graduate
department chairmen; forms to be used in evaluating the tests are
routinely sent with inspection copies and are completed by faculty
members who review them. Articles about certain Advanced Tests
appear occasionally in appropriate professional journals, and
presentations on Advanced Tests are sometimes made at
professional meetings. These articles and presentations help
secure feedback from test users about the content validity of the
tests.

Construct Validity

Construct validity concerns the degree to which the domains of
skills tapped by the test appear to be related to those domains as
defined in other contexts. Construct validity of the Aptitude Test is
evinced by the tendency for people in fields requiring quantitative
skills to have relatively high quantitative scores, and for people in
fields requiring verbal skills to have relatively high verbal scores.
For example, two of the highest correlations between an Advanced
Test and the quantitative ability measure are for economics and
mathematics, both fields in which quantitative skills are important.
High correlations between the verbal ability scores on the Aptitude
Test and scores on the Advanced Literature in English Test, the
Advanced Education Test, and the Advanced Philosophy Test suggest
that the Aptitude Test is truly measuring a verbal construct
underlying performance in those fields. (See Table 13 on page 31.)

The results of predictive validity studies also suggest that verbal
and quantitative ability constructs are appropriately reflected in the
Aptitude Test. In those scientific fields where quantitative ability
counts most, the GRE Aptitude Test quantitative ability score is
typically a better predictor than the verbal ability score.
Correspondingly, the GRE Aptitude Test verbal score tends to be more
valid in such verbally oriented disciplines as English and education
than in scientific fields. Intercorrelations of the verbal and
quantitative measures are sufficiently low (.50-.60) to suggest the
independence of the constructs.

One of the most common ways of investigating construct validity
is through factor analysis. Examination of the relationships among
questions in a test contributes to an understanding of the abilities
that affect performance and has implications for test development.

The decision of the GRE Board to offer a restructured Aptitude
Test in October 1977 was based on the presupposition that the
restructured test should measure the same verbal and quantitative
constructs as those measured before and that a new measure
should tap a construct with unique dimensions. Factor analyses
were performed to determine whether projected changes in the
verbal and quantitative measures would be appropriate to the
constructs as defined by the original verbal and quantitative
measures and whether the analytical question types under study
would have qualities separating them from the verbal and
quantitative domains.

In the first factor analysis (Powers, Swinton, and Carlson, 1977)
undertaken in relation to investigating the possibility of
restructuring the Aptitude Test, principal factor solutions were computed for
the responses of two random samples, each consisting of 8,000
examinees, taking one of two forms of the GRE Aptitude Test. In
addition, the factors of the test forms were extended into each of eight
experimental subtests, which were administered along with the
final forms in a spiral design. These experimental subtests
contained variations of the Aptitude Test content considered as
potential constituents of a restructured Aptitude Test. Four were
verbal and four were quantitative in nature. For example, one
subtest contained short reading comprehension passages
considered for potential inclusion, as well as longer ones that had been
used exclusively in the original operational Aptitude Test. Others
contained reading passages with homogeneous content, either
science passages or humanities and social studies passages. Still
others contained only data interpretation questions, used sparingly
in the original test, and quantitative comparisons, a new
mathematics question type not then used in the Aptitude Test. (For an
extended discussion of Aptitude Test content, see Chapter 3.)

Both forms revealed three factors reflecting the global structure
of skills assessed by the Aptitude Test. These three major
dimensions of question covariance accounted for about three-fourths of
the common variance in each form. Factor I was identified as a
general quantitative factor, accounting for nearly 30 percent of the
common variance in each form. Most of the quantitative questions
but none of the verbal questions loaded highest on this factor.

Factor II, accounting for about one-fourth of the common
variance in each form, was identified as a reading comprehension
(connected discourse) factor. Almost all the questions associated
with reading passages exhibited their highest loadings on this
factor, and most of the sentence completion questions showed
substantial loadings on this factor. Physical science passages
appeared less strongly related to the comprehension dimension than
passages based on humanities or social science content.

Factor III was identified as a vocabulary factor (words and
concepts in isolation) and accounted for about 20 percent of the
common variance of each form. Approximately 90 percent of the
antonym questions and 70 percent of the analogy questions loaded
highest on this factor.

The other less important factors (in terms of variance explained)
identified for the first form were as follows: Factor IV, contributing 5
to 10 percent of the common variance, was identified as an
elementary algebra factor since each of the five questions having their
highest loading on the factor involved algebraic notation. Factor V,
also accounting for 5 to 10 percent of the common variance,
appeared to represent speed of response to discrete verbal questions
(as opposed to speed in questions associated with reading
passages). Factor VI, accounting for less than 5 percent of the
common variance, was a dimension of variance underlying certain data
interpretation questions from the quantitative section. These
questions were based on a complex graph or table and required the
extraction of information from that table or graph. Thus, this factor
was identified as the ability to extract information. Factor VII, also
accounting for less than 5 percent of the common variance, was
designated an applications: word problems factor. Factor VIII,
accounting for less than 5 percent of the common variance, seemed
to represent a factor of reading speed in comprehension passages.
From the small percent of variance accounted for by this factor, it
was concluded that speed does not play an important role in the
reading comprehension section of the test.

The results of the factor analysis of the second form resembled
the results of the first. However, two differentiated data
interpretation factors, comparable to Factor VI in the first test form analyzed,
were discovered and called data interpretation: extraction and
manipulation and data interpretation: extraction. In addition, a
factor termed reading comprehension: scientific/technical was
identified; it accounted for less than 5 percent of the test's common
variance. Two other minor factors not identified in the first form
were found: quantitative speed (accounting for less than 5 percent
of the variance) and what was dubbed easy antonyms, though
interpretation of this latter factor was somewhat problematic.

In the second form of the test, two factors isolated in the first
form were not found: elementary algebra (characterized by the
presence of algebraic notation) and word problems. The two forms
were, however, very closely similar since the first three rotated
factors, which together account for approximately three-fourths of the
common and 40 percent of the total variance in each form,
represent the global skills tapped by the GRE Aptitude Test before
restructuring: one quantitative and two verbal factors. The
quantitative factor is general in nature by virtue of its high loadings on
most of the quantitative items. The two verbal factors define
abilities to deal with connected discourse (reading comprehension
passages and sentence completions) and with words in isolation
(antonyms and analogies). None of the remaining factors explains
more than 10 percent (and most less than 5 percent) of the common
variance.
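
The variance figures quoted above imply a relationship between common and total variance that the text does not state outright. The following Python sketch works through that arithmetic. The rounded shares (0.30, 0.25, 0.20) are read off the approximate percentages in the text, the variable names are illustrative only, and the derived figures are rough consequences of those roundings rather than values reported by the study.

```python
# Approximate shares of COMMON variance for the three major factors,
# rounded from the text ("nearly 30 percent", "about one-fourth",
# "about 20 percent"). These roundings are an assumption of this sketch.
common_share = {
    "I (general quantitative)": 0.30,
    "II (reading comprehension)": 0.25,
    "III (vocabulary)": 0.20,
}

# Together the three factors cover about three-fourths of the common variance.
three_factor_common = sum(common_share.values())

# The text states that the same three factors account for about
# 40 percent of the TOTAL variance in each form.
three_factor_total = 0.40

# If ~75% of the common variance equals ~40% of the total variance,
# common variance is roughly 0.40 / 0.75 of the total variance.
common_over_total = three_factor_total / three_factor_common

# Each factor's approximate share of total variance follows by scaling.
total_share = {
    name: share * common_over_total for name, share in common_share.items()
}

if __name__ == "__main__":
    print(f"common / total variance is roughly {common_over_total:.2f}")
    for name, share in total_share.items():
        print(f"Factor {name}: roughly {share:.0%} of total variance")
```

Under these roundings, a little more than half of the total variance in each form would be common variance, with the remainder unique to individual questions or attributable to error.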

The factor analysis not only identified the construct structure of
the verbal and quantitative measures but also resulted in
recommendations concerning proposed alterations in the test. The
proposed inclusion of short as well as long passages in a
restructured Aptitude Test was supported because the relationships
among the questions associated with the shorter experimental
passages were as well explained as those associated with the
longer passages used in the operational forms.

Since the content of experimental subtests containing only
scientific passages was not as well explained by the operational
test factors as subtests containing nonscientific passages, the
proposed provision of separate reading options (one with scientific,
the other with humanities/social studies content) was abandoned.
Such a change would have made the test a different experience for
science students than for the other students.

More than 80 percent of the established common variance of
each of the experimental quantitative pretests was explained by the
factors found in the operational test. Thus a change to include
quantitative comparison questions in a restructured test was
considered acceptable in view of the objective to retain the
construct validity of the original quantitative measure.



From the above discussion it can be observed that three
decisions concerning test specifications for the restructured test were
related to construct validity as investigated by factor analysis: 1) the
decision to include short as well as long passages in the reading
comprehension part of the test; 2) the decision not to separate
reading comprehension passages by subject matter content into
two optional modules; and 3) the decision to include some
quantitative comparison questions in the quantitative ability measure to
reduce the testing time, because these questions had high loadings
on the general quantitative factor.

As stated earlier, evidence of construct validity is not found in a
single study; rather, judgments of construct validity are based on
an accumulation of research results. This accumulation is usually
based not only on several studies using similar methods, but also
on investigations using a variety of different techniques. Campbell
(1960) and Campbell and Fiske (1959) have pointed out that to
demonstrate construct validity it is necessary to show not only that
the measure correlates highly with certain other variables, a
process referred to as convergent validation, but also that it does
not correlate significantly with certain other variables from which it
should differ, a process referred to as discriminant validation. A
good deal of research has been directed at establishing these two
types of construct validity for the new analytical measure.
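
As a rough illustration of the convergent and discriminant checks just described, the Python sketch below tests whether a set of correlations follows the expected pattern. The function name, the cutoff values, and every correlation shown are hypothetical; none comes from Campbell and Fiske or from GRE research, and fixed cutoffs are a simplification of the judgment the text describes.

```python
def supports_construct(convergent, discriminant, high=0.50, low=0.30):
    """Check a simple convergent/discriminant pattern.

    convergent:   correlations with variables the measure SHOULD relate to
    discriminant: correlations with variables it should DIFFER from
    high, low:    arbitrary illustrative cutoffs (assumptions of this sketch)
    """
    # Convergent validation: every expected relationship is reasonably strong.
    converges = all(abs(r) >= high for r in convergent.values())
    # Discriminant validation: every unwanted relationship is reasonably weak.
    discriminates = all(abs(r) <= low for r in discriminant.values())
    return converges and discriminates


# Hypothetical correlations of a new measure with other variables.
convergent = {"other reasoning test": 0.62, "faculty rating of analysis": 0.55}
discriminant = {"vocabulary": 0.21, "computation speed": 0.18}

result = supports_construct(convergent, discriminant)

if __name__ == "__main__":
    print("pattern consistent with construct validity:", result)
```

In practice the comparison is a matter of degree across many studies rather than a single pass/fail test, which is why the text speaks of an accumulation of research results.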

Evidence of the discriminant validity of the new types of
questions has been obtained through a variety of judgments made by
faculty, examinees, and others about whether the questions seem
to measure something different from the verbal and quantitative
skills also assessed by the Aptitude Test. Correlational analyses
have shown that experimental tests containing the new analytical
questions are in general slightly more related to the verbal and/or
quantitative portions of the test than the verbal and quantitative
portions are to each other.

Each of the three types of questions included in the new
analytical ability measure of the Aptitude Test introduced in
October 1977 exhibits, in differing degrees, variance not shared
with the verbal or quantitative measures. Results of factor analysis
studies conducted to date suggest that the logical diagrams
questions have somewhat more in common with the quantitative
measure than with the verbal section. A second type of question,
analytical reasoning, shows this same pattern. The third type,
analysis of explanations, however, has slightly more in common
with the verbal section than with the quantitative section of the test.
After statistical removal of the verbal and quantitative factors,
however, there remains a unique interpretable dimension for each
of the three types of analytical questions. Thus, results of factor
analysis suggested that the addition of an analytical measure to the
Aptitude Test would be supplementary rather than redundant.
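
The idea of variance "not shared with the verbal or quantitative measures" can be sketched with a regression-based simplification. This is not the factor-analytic procedure used in the studies cited; it is only an illustration of how much variance in a third measure is left over once two correlated measures are accounted for. The three correlations are hypothetical and the function names are illustrative.

```python
def shared_variance(r_av, r_aq, r_vq):
    """Squared multiple correlation of measure A with V and Q.

    r_av, r_aq: correlations of A with V and with Q
    r_vq:       correlation between V and Q (must not be +1 or -1)
    """
    return (r_av**2 + r_aq**2 - 2 * r_av * r_aq * r_vq) / (1 - r_vq**2)


def unique_variance(r_av, r_aq, r_vq):
    """Proportion of A's variance NOT predictable from V and Q together.

    Note: this remainder includes measurement error as well as any
    genuinely distinct ability, so it overstates the reliable unique part.
    """
    return 1.0 - shared_variance(r_av, r_aq, r_vq)


# Hypothetical correlations, chosen only for illustration.
r_av, r_aq, r_vq = 0.60, 0.65, 0.55
unique = unique_variance(r_av, r_aq, r_vq)

if __name__ == "__main__":
    print(f"variance not shared with V and Q: {unique:.2f}")
```

A nonzero remainder by itself does not show that a new measure is useful; as the text notes, the remaining dimension must also be interpretable.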

Criterion-Related Validity 

Because the purpose of the Graduate Record Examinations is to
select applicants who will be best prepared to succeed in a
graduate study program, the actual relationship between
performance on the GRE and performance in a graduate program is an
important concern.

Predictive Validity 

Of the various ways of exploring criterion-related test validity,
determining the predictive effectiveness of the examination is the
most practical. Correlational analysis has been the principal
research design for evaluating methods of selection, particularly in
the case of easily quantifiable selectors such as test scores and
undergraduate grades. One or more predictors (measures of
student potential used for selection) may be evaluated by the extent
to which they accurately forecast one or more criteria (measures of
student success). The value of a predictor for selecting students is
usually considered to vary directly with the size of its correlation
with the criterion (Cronbach, 1971). This correlation, the validity
coefficient, ranges from a chance relationship of 0.00 to a perfect
relationship of 1.00, though negative coefficients can occur and
perfect validity is not closely approached in practice. Usually more
than one predictor is involved (for example, a test and a grade
average), and in some cases a statistically weighted composite of
the predictors is typically more useful for selection purposes than
either predictor alone.
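
A minimal Python sketch of the validity coefficient described above follows. It computes the correlation of each of two predictors with a criterion, then the multiple correlation of their best least-squares weighted composite. All scores are invented for illustration and are not GRE data; the function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def multiple_r(r1, r2, r12):
    """Correlation of the criterion with the least-squares weighted
    composite of two predictors, given the three pairwise correlations."""
    r_squared = (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)
    return sqrt(r_squared)


# Invented scores for six hypothetical students.
test_score = [450, 520, 580, 610, 700, 540]   # predictor 1: a test score
undergrad = [2.9, 3.4, 3.0, 3.6, 3.5, 3.1]    # predictor 2: grade average
grad_gpa = [3.0, 3.2, 3.3, 3.7, 3.8, 3.1]     # criterion: graduate grades

r_test = pearson_r(test_score, grad_gpa)      # validity coefficient, test
r_ugpa = pearson_r(undergrad, grad_gpa)       # validity coefficient, grades
r_between = pearson_r(test_score, undergrad)  # predictor intercorrelation
r_composite = multiple_r(r_test, r_ugpa, r_between)

if __name__ == "__main__":
    print(f"test alone:   {r_test:.2f}")
    print(f"grades alone: {r_ugpa:.2f}")
    print(f"composite:    {r_composite:.2f}")
```

Within the sample used to fit the weights, the composite correlation can never be smaller than that of either predictor alone; with groups this small the gain is likely to shrink when the weights are applied to new students.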

Although correlational analysis has conceptual simplicity, its
application to the study of graduate student selection is complicated
by a number of serious and often insoluble problems. First, in a
plan to study the effectiveness of predictors, a decision must be
made concerning the criteria by which to judge their effectiveness.
Graduate grades are readily available and relevant indications of
success, but many faculty members doubt that even reliable grades
represent the most important outcomes of education.
Comprehensive examinations are limited. Faculty ratings tend to be unreliable.
Whether a student attains the Ph.D. depends on academic
persistence and probably does not differentiate very well the most
promising scholars and professionals. Yet waiting for proof of
scholarship and contributions to a discipline could result in
indefinite postponement of a predictive validity study. No criterion
will be totally satisfactory, and the use of several relevant criteria
must inevitably represent a compromise.

Second, almost any information on graduate student
performance accumulates slowly. First-year grades are generally the
earliest obtainable information for use as a criterion. Information
on Ph.D. completion may not be collectible for several years after
the predictor information has been recorded. For example, the
analytical ability measure of the GRE Aptitude Test, introduced in
October 1977, will not be subjected to predictive validity studies
until criterion information is available, the end of 1979 at the very
earliest; studies using the criterion of Ph.D. completion will not be
possible until the early to middle 1980s.

Another serious problem is that, when studies focus on particular
institutions and departments, as they must, the number of students
may be so small that findings will fail to be statistically significant.
Repeated studies on small groups may result in widely varied
findings, with a predictor appearing very effective one year and
ineffective another. Appendix II presents, for the GRE Advanced Tests,
summaries of predictive validity studies relevant to each test for
which they are available. These studies illustrate the problem of
numbers, especially in relatively small fields. Concern for content
and construct validity is especially important when correlational
studies are difficult to interpret because of the small number of
students involved.
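The instability described above can be made concrete with a confidence interval for a correlation. The sketch below uses the Fisher z transformation, a standard approximation that the manual itself does not mention; the coefficient of .30 is a hypothetical value chosen for illustration, and the function name `fisher_ci` is likewise an assumption of this example.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a correlation r
    observed in a sample of n students (Fisher z transformation)."""
    z = math.atanh(r)                 # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Hypothetical validity coefficient of .30 at several sample sizes.
for n in (20, 80, 500):
    lo, hi = fisher_ci(0.30, n)
    print(f"N = {n:3d}: interval {lo:+.2f} to {hi:+.2f}")
```

With 20 students the interval spans zero, so the same predictor could plausibly look useless in one year and strong in the next; the interval narrows only slowly as the sample grows.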

An equally serious problem that results in deflated validity
coefficients is the restricted range of students' performance on
both predictors and criteria. From the standpoint of research
design, an ideal method of studying predictive validity would be to
collect selection information for all applicants, admit a random
sample of applicants, and then examine the relationship between
the criterion (performance in graduate school) and the predictors.
However, to admit students on a random basis is probably neither
practical nor ethical. Graduate applicants generally represent a
highly select group with respect to academic ability and past
performance. By the time students are admitted to departments,
further restriction of range is introduced either directly (when the
GRE and undergraduate grades have been used in selection) or
indirectly (when other related variables have been used). Restriction
in range on one or more of the predictor variables under consideration
makes it difficult to obtain a clear assessment of the actual
value of the predictors involved, since observed validity coefficients
tend to be lower than would be the case if a full range of talent (a
group representative of all applicants) could be included in
departmental samples.

In recent years, GRE verbal ability scores for examinees
nationally have had standard deviations of approximately 125, and the
standard deviations of GRE quantitative ability scores have been
approximately 135. In departmental samples such as those involved
in many validity studies, standard deviations of 75 to 90 on one or
both of these variables are not uncommon, indicating that the
range of ability available for study is considerably less than that in
the total group of individuals taking the GRE nationally.
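To see how much a narrower spread of scores can depress an observed coefficient, the sketch below applies the usual formula for direct selection on the predictor, which assumes a linear relationship with equal scatter across the score range. The manual does not state this formula; the unrestricted validity of .50 is a hypothetical value, while the standard deviations of 125, 90, and 75 are those quoted in the paragraph above.

```python
import math

def restricted_r(r_full, sd_restricted, sd_full):
    """Correlation expected in a group whose predictor SD has been
    reduced from sd_full to sd_restricted by direct selection."""
    u = sd_restricted / sd_full
    return r_full * u / math.sqrt(1.0 - r_full ** 2 + (r_full * u) ** 2)

R_FULL = 0.50  # hypothetical validity among all examinees
for sd in (125, 90, 75):
    print(f"SD = {sd:3d}: expected r = {restricted_r(R_FULL, sd, 125):.2f}")
```

Under these assumptions, a department whose verbal scores have a standard deviation of 75 would observe a coefficient of about .33 where .50 holds nationally.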

Restriction in the range of criterion values also complicates the
interpretational outlook. If criterion values such as graduate grades
vary only over a very limited range (A to B), differences in student
performance may not be measured reliably. This tends to lead to
underestimation of the overall utility of a predictor (Wilson, 1977).

The effect of restriction of range is often seen in studies in which
a number of possible predictors are examined, only some of which
are actually used. Predictors not in use would be expected to show
a spuriously high correlation with the criteria because students are
more heterogeneous in those respects; the range of distinctions
has not been restricted by previous selection.

Although a predictive validity study is intended to check on the
effectiveness of predictors, it is not intended to identify from an
infinite list of possible predictors those that should be used
because they provide high correlations. Predictors and criteria should
have reasonable reliability (stability and freedom from distortion),
educational relevance, and acceptability in terms of economic and
administrative feasibility and ethical considerations. Faculty
recommendations and ratings, though educationally relevant, are
frequently unreliable; that is, the assessments of various judges are
not comparable. On the other hand, predictors with the highest
validity coefficients may not be educationally relevant. For
example, the variable most highly related to the academic success
of students might theoretically be possession of an automobile, but
this variable is clearly neither an educationally sound nor an
ethically acceptable predictor.

The correlations obtained in predictive validity studies, if
interpreted in light of the limitations of such studies and with
reference to educational and social values, are valuable
information. A number of institutions and agencies have participated in
predictive validity studies including the GRE as predictors, and
results of some of those studies are summarized in Table 32. The
studies summarized here include all or some of the following
predictors: GRE Aptitude Test verbal and quantitative ability scores;
GRE Advanced Test scores in students' chosen fields (thus, the
content varies, depending on the department concerned); a GRE
composite (usually the average of two or three GRE scores, though
this composite was occasionally weighted statistically);
undergraduate grade-point average (undoubtedly computed in different
studies in different ways that were seldom specified very carefully);
recommendations (almost exclusively from three extensive studies
of National Science Foundation fellowship applicants by Creager
(1965) and Rock (1972), where the average rating of several letters
of reference was used); and a weighted composite of GRE and
grade-point average. The criteria of success in graduate school
include graduate grade-point average, faculty ratings (typically
representing the composite judgments of several faculty members
concerning professional promise or overall success as a graduate
student), departmental examinations (very few cases), and
attainment of the Ph.D. This last factor typically means attaining the
degree within a certain number of years, so a time element is also
involved and has been formalized in the time-to-Ph.D. criterion by
assigning criterion scores to students according to years elapsed
between B.A. and Ph.D. All the data concerning Ph.D. attainment
come from two studies by Creager (1965).

Table 32: Median Validity Coefficients† for Various Predictors
and Criteria of Success in Graduate School

                          Criteria of Success
                          Graduate    Faculty     Departmental  Attain    Time to
Predictors                GPA         Rating      Examination   Ph.D.     Ph.D.

GRE verbal score          .24         .31         .42           .18       .16
                          [?]         27          5             47        18

GRE quantitative score    [?]         .27         [?]           [?]       [?]
                          [?]         25          5             47        18

GRE Advanced Test score   .30         .30         .48           .35       .34
                          [?]         8           2             40        18

GRE composite             .33         .41         [?]           .31       .35
                          30          8                         [?]       18

Undergraduate GPA         .31         .37         [?]           .14       .23
                          [?]         15                        30        9

Recommendations           *           *           [?]           [?]       .23
                                                                15        9

GRE-GPA composite         .45         [?]         [?]           .40       .40
                          24                                    16        [?]

† The lower number in each pair represents the number of
coefficients upon which each median is based. See pages 60-62 for a
list of the validity studies summarized here.
* No data available.
[?] Entry illegible in the source copy.

The studies summarized in Table 32 cover the period from 1952 to
1972, although about half were dated during the last five years of
the span (see the list of studies on pages 60 to 62). Half of these
studies were published; the other half were institutional reports or
theses. Many of the studies were earlier described individually by
Lannholm (1968, 1972). Since a report of more recent studies, as
well as current studies sponsored by the GRE Board, has not yet
been published, studies completed since 1972 are not summarized
here.

The 43 studies in this summary include 138 independent sets of
data, usually corresponding to departments, though occasionally
representing some broader group such as first-year students
across several departments. Individual sets of data are based on 20
to 1,479 students (median N = 80). The total number of students
included in all studies is 21,214, and the total number of validity
coefficients is 616.

Predictability of Graduate Success. The studies represented in
Table 32 vary widely in quality and scope. Some are based on small
samples, making individual correlations unreliable. But the
medians based on more than just a few coefficients should give a
dependable idea of how valid these predictors are and how
predictable are the various criteria of graduate success. Insofar as
possible, the same data have been sorted by major field and
presented in Table 33 to illustrate differential validity of the
predictors for different disciplines. Several observations can be made
from these tables.

• Validity coefficients for the various predictors and composites
(against the graduate grade-point average criterion) tend to be
about .15 lower than corresponding median coefficients at the
undergraduate level (Fishman and Pasanella, 1960).

• The undergraduate grade-point average is a moderately good
predictor of graduate grade-point average and faculty ratings; it
is a poor predictor of whether a student will attain the Ph.D.
Depending on the success criterion used, the GRE composite is
either slightly more valid or substantially more valid than the
undergraduate grade-point average.

• The GRE quantitative ability score is typically a better predictor
in those scientific fields where quantitative ability counts most.
The reversal in the field of mathematics may be due to restriction
in the range of quantitative ability scores because of heavy
emphasis on this variable in selection. Correspondingly, the GRE
verbal ability score tends to be more valid in verbally oriented
disciplines such as English and education. Otherwise the pattern
of validity coefficients is fairly similar from one discipline to the
next.

• The GRE Advanced Test score is evidently the most generally
valid predictor among those included. In seven of the nine
disciplines in Table 33, it has the highest validity among the three
GRE scores. In eight of the nine fields, it has higher validity than
undergraduate grade-point average.

• Recommendations appear to be a fairly poor predictor of whether
a student will successfully complete a doctoral program.

• The comprehensive departmental examination seems a
somewhat more predictable criterion than the others examined
here. This is an uncertain conclusion because the available data
are sparse, but the conclusion is consistent with the reasonable
assumption that such a criterion should be more reliable than
the others represented.

• A weighted composite including undergraduate grade-point
average and one or more GRE scores typically provides a validity
coefficient in the .40-.45 range for various criteria of success
and for different academic fields. This is somewhat higher than
the validity of GRE scores alone. The composite of undergraduate
grade-point average and GRE provides a substantially
more accurate prediction than does undergraduate grade-point
average alone. This is the case for each success criterion and
practically every academic discipline.

Table 33: Median Validity Coefficients* for Five Predictors of
Graduate Success (Variously Defined**) in Nine Fields

                                                                  Engineering
                                                                  and Applied                                                 Social
Predictors                    [?]         Chemistry   Education   Science     English     Math        [?]         Psychology  Science

GRE verbal score              .18         .22         .36         .29         [?]         .30         .02         .19         .32
                              7           [?]         [?]         [?]         [?]         6           6           23          [?]

GRE quantitative score        .27         .28         .28         .31         .06         .27         .21         .23         .32
                              8           [?]         14          10          [?]         6           6           22          [?]

GRE Advanced Test score       .26         .39         .24         .44         .43         .44         .38         .24         .46
                              5           9           6           7           3           5           5           17          5

Undergraduate GPA             .13         .27         .30         [?]         .22         .19         .31         .16         .37
                              2           7           5           4           1           4           4           15          6

GRE-GPA composite (weighted)  .35         .42         .42         .47         .56         .41         .45         .32         .40
                              3           6           7           4           2           3           [?]         4           [?]

* The lower number in each pair represents the number of
coefficients upon which each median is based. See pages 60-62 for a
list of the validity studies summarized here.
** In data where two criteria were included, one was selected for
the purposes of this table in the following order of priority: GPA,
Attain Ph.D., Department Test, and Faculty Rating.
[?] Entry illegible in the source copy.
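The observation that a weighted composite outperforms either predictor alone can be illustrated with the standard two-predictor regression formulas. In the sketch below, .33 and .31 are the Table 32 medians for the GRE composite and undergraduate GPA against graduate GPA; the intercorrelation of .25 between the two predictors is an assumption made only for this example, since the manual does not report it, and the function name is hypothetical.

```python
import math

def two_predictor_composite(r1y, r2y, r12):
    """Standardized least-squares weights and multiple correlation
    for a composite of two predictors of one criterion."""
    denom = 1.0 - r12 ** 2
    b1 = (r1y - r12 * r2y) / denom    # weight for predictor 1
    b2 = (r2y - r12 * r1y) / denom    # weight for predictor 2
    multiple_r = math.sqrt(b1 * r1y + b2 * r2y)
    return b1, b2, multiple_r

# r1y, r2y from Table 32; r12 = .25 is an assumed intercorrelation.
b1, b2, multiple_r = two_predictor_composite(0.33, 0.31, 0.25)
print(f"weights: {b1:.2f}, {b2:.2f}; composite validity: {multiple_r:.2f}")
```

Under these assumptions the composite validity comes out near .41, higher than either component. The gain shrinks as the assumed intercorrelation between the predictors rises, which is why the result should be read as an illustration and not as a reconstruction of the tabled medians.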

Utility of Current Predictions. What overall evaluation can be made
of the extent to which success in graduate school is predictable?
Brogden (1946) was the first to demonstrate that the correlation
coefficient "is the ratio of the increase obtained by selecting above
a given standard score on the test to the increase that would be
obtained by selecting above the same standard score on the criterion
itself" (p. 66). Or, as Cronbach (1971) later stated from the
standpoint of utility theory, the correlation coefficient "expresses
the benefit from testing as a percentage of the benefit one could get
from perfect prediction of outcomes" (p. 496). Thus Table 32
indicates that in most fields the value of prediction by the GRE and
grade-point average composite amounts to about 40 percent of the
benefit that could accrue if prediction were perfect.
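Brogden's ratio can be checked by simulation. The sketch below assumes that test and criterion scores are bivariate normal in standard-score form, draws a large artificial applicant pool, and compares the criterion gain from selecting the top half on the test with the gain from selecting the top half on the criterion itself. The function name, sample size, seed, and cutoff at the mean are all choices made for this illustration.

```python
import math
import random

def brogden_ratio(r, n=200_000, seed=1):
    """Ratio of the criterion gain from selecting on the test to the
    gain from selecting on the criterion, both cut at the mean."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0, 1)                                  # test score
        y = r * x + math.sqrt(1 - r * r) * rng.gauss(0, 1)   # criterion
        pairs.append((x, y))
    mean_all = sum(y for _, y in pairs) / n
    by_test = [y for x, y in pairs if x > 0]
    by_criterion = [y for _, y in pairs if y > 0]
    gain_test = sum(by_test) / len(by_test) - mean_all
    gain_criterion = sum(by_criterion) / len(by_criterion) - mean_all
    return gain_test / gain_criterion

print(f"ratio of gains for r = .40: {brogden_ratio(0.40):.2f}")
```

The ratio should land close to the correlation itself, which is the sense in which a validity of .40 delivers about 40 percent of the benefit of perfect prediction.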

These validity coefficients, in fact, underestimate the usefulness
of prediction in graduate admissions because they are based upon
students actually admitted rather than the full range of students
who apply. There are accepted procedures for correcting this
restriction in range. The resulting validity coefficients are always
higher, and may be substantially so if only a small proportion of
applicants are accepted. For example, if a department selects only
those students above the mean of its applicants on whatever
admissions criterion it uses and the validity of that measure is .40 in
the admitted group, the validity would be .59 if all applicants were
admitted. In actual practice, corrections for restriction are usually
not made routinely because the reasonableness of underlying
assumptions and, therefore, the accuracy of corrections are
difficult to ascertain.
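The .40 to .59 example can be reproduced with the common correction for direct selection on the predictor (often labeled Thorndike's Case 2). The manual does not name the procedure it used, so the formula here is an assumption, as is the premise that the admissions measure is normally distributed among applicants; under that premise, cutting at the mean leaves a group whose variance is 1 - 2/pi of the applicant variance.

```python
import math

def sd_ratio_for_selection_above_mean():
    """Applicant SD divided by admitted-group SD when a normally
    distributed measure is cut at its mean."""
    return 1.0 / math.sqrt(1.0 - 2.0 / math.pi)

def correct_for_restriction(r_restricted, sd_ratio):
    """Estimate the applicant-pool correlation from the correlation
    observed in the selected group (direct selection on the predictor)."""
    rk = r_restricted * sd_ratio
    return rk / math.sqrt(1.0 - r_restricted ** 2 + rk ** 2)

k = sd_ratio_for_selection_above_mean()
corrected = correct_for_restriction(0.40, k)
print(f"SD ratio = {k:.3f}, corrected validity = {corrected:.2f}")
# prints: SD ratio = 1.659, corrected validity = 0.59
```

The agreement with the manual's figure suggests, without confirming, that a correction of this kind lies behind the example. The same function shows why the correction is sensitive to its assumptions: a modest error in the assumed SD ratio moves the corrected coefficient noticeably.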

How useful validity is in practical terms depends also on the cost
of gathering the predictor information, the proportion of students
selected, and the importance of the decision. A small correlation
can produce a large benefit if the proportion of students selected is
low. Finally, a given validity coefficient will have more practical
value if the selection decision is important, and the selection
decision is more important if it is irreversible.

A validity coefficient of .20 might be described as modest and one
of .40 as moderate. The conditions of graduate student selection
are generally favorable to using predictors of even modest validity.
In many departments only a small proportion of students are
accepted; the decisions are quite important to the student and to
society; and the decisions are typically irreversible. There seems little
doubt that the GRE and the undergraduate grade-point average are
providing quite useful information in most situations, particularly
since a given correlation represents greater usefulness in selecting
among the applicant population than its size indicates, if the study
has focused on admitted students only.

Figure 6 illustrates graphically the level of benefit likely to accrue
from using predictors that are valid to the extent indicated.
Students at high ability levels are far more likely to attain the Ph.D.
than those at low levels. The figure also illustrates that many
students fail to attain the degree, even among talented NSF
fellowship applicants. And in these samples reported by Creager (1965),
there are substantial differences in attainment rates among fields.

Figure 6: Usefulness of GRE Advanced Test Scores
for Predicting Ph.D. Attainment in Three Fields
(Creager, 1965)

[Figure not reproduced. Horizontal axis: GRE Advanced Test Stanine
Score, 1 to 9.]

It should be emphasized also that validity studies at particular
schools and departments give varying results. Such variability is
exacerbated by the small samples often used, but real variations do
occur. For this reason, the GRE Board encourages local studies to
enable institutions to justify selection procedures and utilize
available information to maximum benefit.

Other Evidence of Criterion-Related Validity 

Although prediction is clearly of greatest concern in establishing
the validity of the Graduate Record Examinations, the criterion of
undergraduate grade-point average (as reported by students on
GRE answer sheets) permits study of large numbers of students. In
research on restructuring the GRE Aptitude Test, self-reported
undergraduate grade-point average was used as a means of
evaluating the potential usefulness of various measures of
analytical ability. The correlations for the experimental analytical
types of questions, three of which are components of the analytical
measure introduced in October 1977, are reported on page 23. At
the same time, correlations between verbal and quantitative ability
scores and undergraduate grades were obtained. Tables 34 and 35
show the intercorrelations of verbal and quantitative measures in
the October 1975 and December 1975 administrations of the GRE
Aptitude Test. Since correlations were obtained for several spaced
samples of various groups, the ranges of the coefficients are
provided, along with ranges of scaled score means and standard
deviations. The samples are slightly higher-scoring than representative
samples because students answering background questions (on
major field, etc.) tend to be higher-scoring on the average than the
total population.

Table 34: Correlations of Verbal and Quantitative
Ability Scores with Self-Reported Undergraduate
Grades, and Related Scaled Score Means and
Standard Deviations, October 1975

[Table body illegible in the source copy.]

Table 35: Correlations of Verbal and Quantitative
Ability Scores with Self-Reported Undergraduate
Grades, and Related Scaled Score Means and
Standard Deviations, December 1975

[Table body illegible in the source copy.]

The foregoing validity coefficients are probably attenuated by
error resulting from the self-report nature of the criterion, from the
inconsistency of grading practices among departments and
institutions, and from lumping different samples of students together.

Population Validity 

There is another type of validity that warrants special consideration.
The previous discussion and the evidence presented about the
validity of the GRE concern the total population of test takers. It is
increasingly recognized, however, that the total population
includes a variety of significant subgroups: women, men, blacks,
Chicanos, foreign students, older students, and so on. It is frequently
true that such subgroups have special characteristics that may
render a test more or less appropriate for them. Therefore, a test
may be generally valid, but may have, in one sense or another,
limited validity for some prominent group of examinees. Concern
for and evaluation of such possibilities can be usefully expressed as
population validity.

The term population validity has been used in reference to the
generalizability of research findings across different population
groups (Bracht & Glass, 1968; Messick & Barrows, 1972). Messick
(1975) pointed out that the generality of meaning of a test score
across groups is an important aspect of construct validity. Writing
about the Graduate Record Examinations, Willingham (1976) used
the term population validity as a means of relating several matters
concerning test bias to the larger issue of test validity. The rationale
runs as follows.

Cronbach (1971) and others have emphasized that "one validates
not a test but an interpretation of data arising from a specified
procedure" (p. 447). That is, validity pertains not so much to the test
itself as to whether the test leads to correct inferences concerning
the nature of what is measured and the implications of the
measurement. In that sense, the test is not valid if it means different
things for different populations or leads to inferences that are
systematically in error for one group or another.

A variety of questions about validity stem from the fundamental
notion of whether or not a test leads to correct inferences. Many of
these questions were discussed in the previous sections, and, in
theory, any such questions can be applied to any particular
subgroup in the population of examinees. In practice, research on
population validity has focused largely on whether the selection
measures are free from bias in their content and in the accuracy of
the predictions they yield. In either case, the question is not
whether there are any differences in the average scores earned by
different groups of students. On the average, different minority
groups do often earn lower scores, but this is not inconsistent with
other known facts. The content of the test is intended to represent
important outcomes of the mainstream educational system and the
abilities necessary to do well in that system. To the extent that many
members of a minority group have suffered social and educational
disadvantage, they would be expected to find the test difficult, and
a well-developed test should reflect the educational disadvantage a
person has experienced. The critical issue in evaluating the validity
of a test for such groups is rather whether the test fairly represents
the developed ability or achievement it purports to measure.

Research on population validity is hampered considerably by the
difficulty of obtaining relevant data. Often the necessary
information (such as subgroup identification or achievement in graduate
school) is not directly available to the GRE Program staff and
proves difficult to obtain from students or institutions. Another
serious problem is that many groups of particular interest are
represented in graduate education in limited numbers, making
appropriate data all the more difficult to obtain. Due to the nature of
graduate education, these problems are compounded because the
logical locus for many studies is the individual department, where
small classes predominate and the representation of relevant
subgroups is too sparse for research purposes. Some of these
problems, however, can be circumvented, and a great deal can be
inferred about the population validity of the GRE from the
considerable amount of research evidence accumulated in recent years
about several similar tests used in similar circumstances. The
following paragraphs summarize briefly the principal findings of
such research concerning (1) predictive bias and (2) content bias.

With respect to predictive bias, there are two important
questions. One is whether admission tests are as predictive of, or as
highly correlated with, college performance for minorities as for
majority students. The second question is whether there is any
systematic tendency for the tests to underpredict or underestimate
the actual performance of minority students once they are
admitted. A number of studies have been directed to these two questions
in undergraduate colleges and in law schools. The results are
generally quite consistent.

At the undergraduate level, Stanley (1971) reviewed predictive
validities for black and white students and concluded that they are
quite comparable. In 1975, Cleary, Humphreys, Kendrick, and
Wesman again reviewed the situation and concluded that the
predictions within black and white colleges are comparable; within
integrated colleges the usual regression equations lead to
comparable predictions for black and white students (p. [?]1). In a
review of sex bias in selection of freshman college students, Wild
(1977) found a consistent trend of underprediction of women's
grades when the regression equation is developed on a combined
sample of men and women. She hypothesized that the differences
may be due to systematic differences in meaning of the grade-point
average criterion, since men and women enter different major fields
in differing proportions and different fields have different grading
practices.
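Underprediction of the kind described above is usually checked by fitting one regression line to the combined sample and then averaging the prediction errors within each group. The sketch below does this for a small, entirely hypothetical data set (the scores, grades, and group labels are invented for illustration and come from none of the studies cited); a positive mean residual means the pooled equation predicts lower grades than the group actually earns.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def mean_residual_by_group(records, slope, intercept):
    """Average of (actual - predicted) grade within each group."""
    sums, counts = {}, {}
    for group, score, grade in records:
        resid = grade - (intercept + slope * score)
        sums[group] = sums.get(group, 0.0) + resid
        counts[group] = counts.get(group, 0) + 1
    return {g: sums[g] / counts[g] for g in sums}

# Hypothetical records: (group, test score, grade-point average).
records = [
    ("A", 400, 2.8), ("A", 500, 3.0), ("A", 600, 3.2), ("A", 700, 3.4),
    ("B", 400, 3.0), ("B", 500, 3.2), ("B", 600, 3.4), ("B", 700, 3.6),
]
slope, intercept = fit_line([r[1] for r in records], [r[2] for r in records])
residuals = mean_residual_by_group(records, slope, intercept)
for group in sorted(residuals):
    print(f"group {group}: mean residual {residuals[group]:+.2f}")
```

In this constructed example group B earns grades 0.1 higher than the pooled line predicts at every score level, so the combined equation underpredicts for B and overpredicts for A by the same amount.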

Linn (1975) reviewed the prediction of grades in law school by the
traditional measures such as undergraduate grade-point average
and Law School Admission Test scores and concluded that the
traditional predictors of law school grades are usually found to be
as adequate for minority persons as for majority persons and the
use of a single prediction equation usually favors the minority
group member (p. 43). Pitcher (1975) studied the prediction of
grades for female law school students in 2[?] law schools and found
that the results of the study would in general support the use, for
either men or women, of a regression system based on data for
both groups combined as long as combinations of predictors (test
scores and undergraduate grades) are used (p. 1). This result
supports the hypothesis of Wild (1977), since no differences were
found when multiple predictors were used and for a single field of
study.

These results are confirmed by an earlier review of Linn (1973)
and by more recently completed studies of women, blacks, and
Chicanos by Pitcher (1977) and Powers (1977). Thus, the results of
research on tests generally similar to the GRE consistently indicate
that academic performance of women and minority students is
predicted accurately and fairly as compared to predictions of
performance of males or majority students.

Available data on the GRE are consistent with that pattern. A
Cooperative Validity Studies project still in progress collected
validity data for 131 minority students spread through 14 graduate
departments in 3 universities. Individual correlations in such small
samples are quite unstable (the average N was only 9), but the
median validity coefficients of .33 for GRE Aptitude Test verbal
ability scores and .3[?] for quantitative ability scores are comparable
to or slightly higher than those reported for various groups of
graduate students in Table 32 on page 56. Limited analyses by sex,
based on data collected in the same project, show similar validities
for women and men and also for foreign and nonforeign students in
the quantitatively oriented fields in which foreign students tend to
be found. Positive relationships between the GRE scores of foreign
students and performance in graduate school have also been found
in an earlier study by Harvey and Pitcher (1963).

A second general class of research on population validity has
concerned the reasonableness and fairness of test content for
different subgroups. Even if no evidence of predictive bias in a test is
found, incorrect inferences may be drawn about the ability of
students in particular subgroups because the content of the test is
somehow inappropriate for those subgroups. Such research has
centered on the internal characteristics of the test, particularly the
question of whether certain subsets of questions tend to be
inordinately difficult for some group or whether the overall pattern of
difficulty is quite different for that group as compared to the total
population of examinees.

With any test, it is normally assumed that the subject matter or
total domain of questions may be generally unfamiliar or harder for
particular groups of examinees. This in itself does not argue
measurement bias, but may well represent a history of educational
disadvantage with respect to the particular subject matter of the
test. But if certain clusters of questions that share some particular
characteristic prove overly difficult, or if the group exhibits an
unusual pattern of difficulty, one could more reasonably assert that
there is a bias in the choice of questions or in the process of test
construction. The bias argument would then stem from an
assumption that the group in question may not have had a fair opportunity
to learn the particular cluster of suspect questions, or that the test
questions generally may not mean the same thing for that group. In
either event, the suggestion is that the domain (that is, construct) is
somewhat different for the group in question and that the use of the
suspect questions may be inappropriate.

There has been a good deal of research of this general sort,
often called item-group interaction studies. Again, much of the
research has been done on tests generally similar to the GRE (Cleary
and Hilton, 1968; Angoff and Ford, 1973; Breland et al., 1974;
Swineford, 1976). If difficulties of individual questions are
compared for two large groups of examinees, the difficulty of a question
for one group can usually be predicted with a high degree of
accuracy on the basis of its difficulty for the other group (that is, the
difficulty indexes correlate on the order of .95 to .99). When such
analyses have been made on the basis of samples of black and
white students, a typical result is to find that practically all items in
the admission test are consistently somewhat more difficult for the
black group, and the correspondence of difficulty from one group
to the other is still high (correlations on the order of .90), but
somewhat more erratic than would be the case if two white samples
were compared (Breland et al., 1974). Such a finding might suggest
that different questions in the test might be perceived somewhat
differently (that is, measure a different construct) for students from
different cultural backgrounds, but differences in the relative
difficulty of questions for different racial groups are typically not
large. Furthermore, there are similar small differences between
rural and urban groups and also between blacks in different cities.
Finally, Angoff and Ford (1973) demonstrated that such
discrepancies in relative difficulties across groups were reduced
considerably by matching the two groups on overall performance;
that is, regardless of ethnic group, the questions performed
similarly for black and white students of high, medium, and low score
levels. These findings support the assumption that the questions in
these tests are perceived much the same way by black and white
students and that the tests are measuring the same thing for both
groups.

Another special interest in much of the research on item-group
interactions has been to identify particular types of questions that
stand out as significantly more difficult for women, ethnic
minorities, or other groups and to avoid including such questions
in the test. As yet this has not been a particularly fruitful line of
inquiry. Typically, few questions stand out as unusually difficult for
the subgroup, and there is usually no apparent reason for those
that do. In some studies, however, it has been observed that
questions associated with material having minority content tend to be
somewhat easier for minority students than other questions. Due in
part to such findings, a careful effort is made to include in GRE
tests some material especially relevant to different subgroups, but
to what extent such representation can be undertaken without
changing the character of the tests or making them unfair to some
other examinees is an issue not easily resolved.

Figure 7 shows an illustrative item analysis for 60 reading
comprehension questions in the GRE Aptitude Test. Each point
plotted in the figure represents the relative difficulty of one
question for a sample of black women (N = 1,165) and for a reference
sample generally representative of all examinees (N = 1,065). For
illustrative purposes, the difficulty indexes (deltas) plotted here
were equated so that the average delta is the same for the two
groups shown. The correlation between the deltas for the two
groups is quite high as correlations go (.69) but noticeably lower
than correlations usually observed across comparable groups.
Some consistent differences are apparent. The black women found
items containing material about blacks to be relatively easier (13
items below and 4 items above the 45° equal-difficulty line), but
items concerning science relatively more difficult (16 above the line
and 7 below). Such results do not appear unreasonable in light of
what may be assumed about learning experiences of black women,
but the finding does not, in itself, warrant or even argue for
deletion of either type of item. Both have a legitimate rationale for
inclusion in the test, even though there may be small differences in
the ability of one group or another to answer the different items
correctly.
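
The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for Figure 7: the delta index is assumed to follow the common convention delta = 13 - 4z, where z is the normal deviate for the proportion answering correctly (a convention not defined in this section); the equating is assumed to be a simple shift of means; and the proportions correct are invented values, not GRE data. The function names are hypothetical.

```python
from statistics import NormalDist, mean


def delta(p_correct):
    """Difficulty index under the ASSUMED convention delta = 13 - 4 * z(p).

    Harder items (smaller proportion correct) receive larger deltas.
    """
    return 13.0 - 4.0 * NormalDist().inv_cdf(p_correct)


def equate_to_reference(focal_deltas, reference_deltas):
    """Shift the focal-group deltas so both groups have the same mean delta.

    ASSUMPTION: a plain mean shift stands in for the equating step.
    """
    shift = mean(reference_deltas) - mean(focal_deltas)
    return [d + shift for d in focal_deltas]


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def count_about_line(focal_equated, reference_deltas):
    """Count items above and below the 45-degree equal-difficulty line.

    With the focal group on the vertical axis, "above" means the item is
    relatively harder for the focal group, "below" relatively easier.
    """
    pairs = list(zip(focal_equated, reference_deltas))
    above = sum(f > r for f, r in pairs)
    below = sum(f < r for f, r in pairs)
    return above, below


# Hypothetical proportions correct for five items (illustrative only).
reference_p = [0.80, 0.65, 0.55, 0.40, 0.30]
focal_p = [0.70, 0.60, 0.35, 0.30, 0.25]

reference_deltas = [delta(p) for p in reference_p]
focal_equated = equate_to_reference([delta(p) for p in focal_p], reference_deltas)
r = pearson_r(focal_equated, reference_deltas)
above, below = count_about_line(focal_equated, reference_deltas)
print(f"r = {r:.2f}; items above line: {above}; items below line: {below}")
```

Counting items above and below the line separately for each content category would give tallies of the kind quoted in the text (for example, 13 below and 4 above).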

Information on such differences can be useful, however, in
assessing the appropriateness of test content. For example, in
restructuring the GRE Aptitude Test, a relatively efficient type of
question, quantitative comparisons, was considered as a
component of the quantitative ability measure. This type of
question was shown by factor analysis to be highly related to the
original types of questions in the test, and other research suggested
that minority students tend to perform slightly better on this type of
question than on other mathematical questions. Thus, quantitative
comparisons were chosen for use because they were measures of
the original construct and had content and criterion-related
validity. However, this outcome tended to strengthen the case for
their use because it confirmed that the change would not make the
test harder for minority students.

Figure 7: Equated Difficulty of Four Types of
Reading Comprehension Items in the GRE
Aptitude Test for Black Females and a
Representative Reference Sample

[Scatter plot not reproduced. Horizontal axis: Equated Difficulty
for Reference Sample. The key distinguishes the four item types.]

In summary, a considerable amount of research on tests like
those of the GRE supports the general conclusion that such
examinations provide fair assessment of the particular academic
abilities they represent, and that they are as predictive of the
success of women and ethnic minorities as of admission applicants
generally. Information available on GRE tests is consistent with that
conclusion, but other research is still underway and doubtless will
continue. The GRE Board's concern for the fairness of the GRE for
different populations of examinees extends to other related
questions that may have an important bearing on test performance,
especially the possibly differential effects of coaching, speededness,
and guessing habits on scores of minority groups. Research
on such questions is in progress.



Validity Studies Summarized In Discussion of 
Predictive Validity 

Alexakos, C. E. The Graduate Record Examinations: Aptitude Tests
as screening devices for students in the College of Human
Resources and Education. Unpublished report, West Virginia
University, 1967. Reported by G. V. Lannholm in GRE Special
Report 68-1. Princeton, N.J.: Educational Testing Service, 1968.

Besco, R. O. The measurement and prediction of success in
graduate school. Ph.D. dissertation, Purdue University, 1960.
Reported by G. V. Lannholm in GRE Special Report 68-1.
Princeton, N.J.: Educational Testing Service, 1968.

Borg, W. R. GRE aptitude scores as predictors of GPA for graduate
students in education. Educational and Psychological
Measurement, 1963, 23, 379-389.

Capps, M. P., & DeCosta, F. A. Contributions of the Graduate Record
Examinations and the National Teacher Examinations to the
prediction of graduate school success. Journal of Educational
Research, 1957, 50, 383-389.

Clark, H. Graduate Record Examination correlations with
grade-point averages in the Department of Education at Northern
Illinois University, 1962-1966. Unpublished Master's thesis,
Northern Illinois University, 1968.

Conway, Sister M. T. The relationship of the Graduate Record
Examination results to achievement in the Graduate School at
the University of Detroit. Unpublished Master's thesis, University
of Detroit, 195?. Reported by G. V. Lannholm in GRE Special
Report 68-1. Princeton, N.J.: Educational Testing Service, 1968.

Creager, J. A. A study of graduate fellowship applicants in terms of
Ph.D. attainment (Technical Report No. 18). Washington, D.C.:
Office of Scientific Personnel, National Academy of Sciences-
National Research Council, 1961.

Creager, J. A. Predicting doctorate attainment with GRE and other
variables (Technical Report No. 25). Washington, D.C.: Office of
Scientific Personnel, National Academy of Sciences-National
Research Council, 1965.

Dawes, R. M. A case study of graduate admissions: Application of
three principles of human decision making. American
Psychologist, 1971, 26, 180-188.

Duff, F. L., & Aukes, L. E. The relationship of the Graduate Record
Examination to success in the Graduate College (a supplementary
comparative analysis of eight previously reported
studies). Bureau of Institutional Research and Office of
Instructional Research, University of Illinois, October 1966.

Eckhoff, C. M. Predicting graduate success at Winona State
College. Educational and Psychological Measurement, 1966, 26,
483-485.

Ewen, R. B. The GRE psychology test as an unobtrusive measure of
motivation. Journal of Applied Psychology, 1969, 53, 383-387.

Florida State University, Office of Academic Research and Planning.
The prediction of grade-point average in graduate school at
the Florida State University, Parts I & II. Florida State University,
December 1971.

Florida State University, Office of Institutional Research and
Service. Relationship between Graduate Record Examinations
Aptitude Test scores and academic achievement in the Graduate
School at Florida State University. Florida State University, 1956.

Hackman, J. R., Wiggins, N., & Bass, A. R. Prediction of long-term
success in doctoral work in psychology. Educational and
Psychological Measurement, 1970, 30, 365-374.



Hansen, W. L. Prediction of graduate performance in economics.
Department of Economics, University of Wisconsin, April 1970.
(Mimeograph)

Harvey, P. R. Predicting graduate school performance in education.
Unpublished ETS report, 1963. Reported by G. V. Lannholm in
GRE Special Report 68-1. Princeton, N.J.: Educational Testing
Service, 1968.

King, D. C., & Besco, R. O. The Graduate Record Examination as a
selection device for graduate research fellows. Educational and
Psychological Measurement, 1960, 20, 853-858.

Lannholm, G. V., Marco, G. L., & Schrader, W. B. Cooperative
studies of predicting graduate school success (GRE Special
Report 68-3). Princeton, N.J.: Educational Testing Service,
August 1968.

Law, A. The prediction of ratings of students in a doctoral training
program. Educational and Psychological Measurement, 1960, 20,
847-851.

Lorge, I. Relationship between Graduate Record Examinations and
Teachers College, Columbia University, doctoral verbal
examinations (Letter to G. V. Lannholm dated September 21, 1960).
Reported by G. V. Lannholm in GRE Special Report 68-1.
Princeton, N.J.: Educational Testing Service, 1968.

Madaus, G. F., & Walsh, J. J. Departmental differentials in the
predictive validity of the Graduate Record Examinations Aptitude
Tests. Educational and Psychological Measurement, 1965, 25,
1105-1110.

Mehrabian, A. Undergraduate ability factors in relationship to
graduate performance. Educational and Psychological
Measurement, 1969, 29, 409-419.

Michael, W. B., Jones, R. A., Al-Amir, H., Pullias, C. M., Jackson, M.,
& Goo, V. Correlates of a pass-fail decision for admission to
candidacy in a doctoral program. Educational and Psychological
Measurement, 1971, 31, 965-967.

Michael, W. B., Jones, R. A., & Gibbons, B. D. The prediction of
success in graduate work in chemistry from scores on the Graduate
Record Examinations. Educational and Psychological
Measurement, 1960, 20, 859-861.

Newman, R. I. GRE scores as predictors of GPA for psychology
graduate students. Educational and Psychological Measurement,
1968, 28, 433-436.

Office of Educational Research. Study of GRE scores of geology
students matriculating in the years 1952-1961 (RP-Abstract,
Yale University, 1963). Reported by G. V. Lannholm in GRE
Special Report 68-1. Princeton, N.J.: Educational Testing Service,
1968.

Olsen, M. The predictive effectiveness of the Aptitude Test and the
Advanced Biology Test of the GRE in the Yale School of Forestry
(Statistical Report 55-6). Princeton, N.J.: Educational Testing
Service, 1955. Out of print.

Roberts, P. T. An analysis of the relationship between Graduate
Record Examination scores and success in the Graduate School
of Wake Forest University. Unpublished Master's thesis, Wake
Forest University, 1970.



Robertson, M., & Nielsen, W. The Graduate Record Examination
and selection of graduate students. American Psychologist, 1961,
16, 648-650.

Robinson, D. W. A comparison of two batteries of tests as predictors
of first year achievement in the graduate school of Bradley
University. Ph.D. dissertation, Bradley University, 1957. Reported
by G. V. Lannholm in GRE Special Report 68-1. Princeton, N.J.:
Educational Testing Service, 1968.

Rock, D. A. The prediction of doctorate attainment in psychology,
mathematics, and chemistry (GRE Board Preliminary Report).
Princeton, N.J.: Educational Testing Service, August 1972 (ERIC
Document Reproduction Service No. ED 069 664). Later
published as GRE Board Research Report 69-6aR, June 1974.

Roscoe, J. T., & Houston, S. R. The predictive validity of GRE scores
for a doctoral program in education. Educational and
Psychological Measurement, 1969, 29, 507-509.

Sacramento State College, Test Office. An analysis of traditional
predictor variables and various criteria of success in the Master's
degree program at Sacramento State College for an experimental
group who received Master's degrees in the spring 1968, and a
comparable control group who withdrew from their programs.
Test Office Report 69-3. Sacramento State College, October 1969.

Shaffer, J., & Rosenfeld, H. MAT-GRE prediction study: Initial
results. Intradepartmental memorandum, Department of
Psychology, University of Kansas, March 1969.

Sistrunk, F. The GREs as predictors of graduate school success in
psychology (Letter to G. V. Lannholm dated October 3, 1961).
Reported by G. V. Lannholm in GRE Special Report 68-1.
Princeton, N.J.: Educational Testing Service, 1968.



Sleeper, M. L. Relationship of scores on the Graduate Record
Examination to grade point averages of graduate students in
occupational therapy. Educational and Psychological
Measurement, 1961, 21, 1039-1040.

Tully, G. E. Screening applicants for graduate study with the
Aptitude Test of the Graduate Record Examinations. College and
University, 1962, 38, 51-60.

University of Virginia, Office of Institutional Analysis. Correlations
between admissions criteria and University of Virginia
grade-point averages, Graduate School of Arts and Sciences, Fall 1964.
University of Virginia, circa 1966. (Mimeograph)

Wallace, A. D. The predictive value of the Graduate Record
Examinations at Howard University. Unpublished Master's thesis,
Howard University, 1952. Reported by G. V. Lannholm in GRE
Special Report 68-1. Princeton, N.J.: Educational Testing Service,
1968.

White, E. L. The relationship of the Graduate Record Examinations
results to achievement in the Graduate School at the University of
Detroit. Unpublished Master's thesis, University of Detroit, 1954.
Reported by G. V. Lannholm in GRE Special Report 68-1.
Princeton, N.J.: Educational Testing Service, 1968.

White, W. A predictive validity study of the Graduate Record
Examinations Aptitude Test at the University of Iowa. Unpublished
Master's thesis, University of Iowa, 1967. Reported by
G. V. Lannholm in GRE Special Report 68-1. Princeton, N.J.:
Educational Testing Service, 1968.

Williams, J. D., Harlow, O., & Grab, D. A longitudinal study
examining prediction of doctoral success: Grade-point average
as criterion, or graduation vs. non-graduation as criterion.
Journal of Educational Research, 1970, 64, 161-164.






References 



American Psychological Association. Standards for educational
and psychological tests. Washington, D.C.: American
Psychological Association, 1974.

Angoff, W. H., & Ford, S. F. Item-race interaction on a test of
scholastic aptitude. Journal of Educational Measurement, 1973,
10, 95-105.

Bracht, G. H., & Glass, G. V. The external validity of experiments.
American Educational Research Journal, 1968, 5, 437-474.

Breland, H. M., Stocking, M., Pinchak, B. M., & Abrams, N. The
cross-cultural stability of mental test items (Project Report 74-2).
Princeton, N.J.: Educational Testing Service, 1974.

Brogden, H. E. On the interpretation of the correlation coefficient
as a measure of predictive efficiency. Journal of Educational
Psychology, 1946, 37(2), 65-76.

Campbell, D. T. Recommendations for APA test standards regarding
construct, trait, and discriminant validity. American
Psychologist, 1960, 15, 546-553.

Campbell, D. T., & Fiske, D. W. Convergent and discriminant
validation by the multitrait-multimethod matrix. Psychological
Bulletin, 1959, 56, 81-105.

Cleary, T. A., & Hilton, T. L. An investigation of item bias.
Educational and Psychological Measurement, 1968, 28, 61-75.

Cleary, T. A., Humphreys, L., Kendrick, S. A., & Wesman, A.
Educational use of tests with disadvantaged students. American
Psychologist, 1975, 30, 15-41.

Creager, J. A. Predicting doctorate attainment with GRE and other
variables (Technical Report No. 25). Washington, D.C.: Office of
Scientific Personnel, National Academy of Sciences-National
Research Council, 1965.

Cronbach, L. J. Test validation. In R. L. Thorndike (Ed.), Educational
measurement (2nd ed.). Washington, D.C.: American Council on
Education, 1971.

Fishman, J. A., & Pasanella, A. K. College admission-selection
studies. Review of Educational Research, 1960, 30, 296-310.

Harvey, P. R., & Pitcher, B. The relationship of Graduate Record
Examinations Aptitude Test scores and graduate school
performance of foreign students at four American graduate schools
(GRE Special Report 63-1). Princeton, N.J.: Educational Testing
Service, April 1963.

Lannholm, G. V. Review of studies employing GRE scores in
predicting success in graduate study, 1952-1967 (GRE Special
Report 68-1). Princeton, N.J.: Educational Testing Service, March
1968.

Lannholm, G. V. Summaries of GRE validity studies 1966-1970
(GRE Special Report 72-1). Princeton, N.J.: Educational Testing
Service, February 1972.



Linn, R. L. Fair test use in selection. Review of Educational
Research, 1973, 43, 139-161.

Linn, R. L. Test bias and the prediction of grades in law school.
Journal of Legal Education, 1975, 27, 293-323.

Messick, S. The standard problem: Meaning and values in
measurement and evaluation. American Psychologist, 1975, 30(10),
955-966.

Messick, S., & Barrows, T. S. Strategies for research and evaluation
in early childhood education. In National Society for the Study of
Education, Seventy-first yearbook, Part II, 1972, pp. 261-290.

Pitcher, B. A further study of predicting law school grades for
female law students (Law School Admission Research Report
LSAC-75-3). Princeton, N.J.: Law School Admission Council,
1975.

Pitcher, B. Subgroup validity study. Report #LSAC-76-6. In Law
School Admission Council, Reports of LSAC Sponsored
Research: Volume III, 1975-1977. Princeton, N.J.: Law School
Admission Council, 1977.

Powers, D. E. Comparing predictions of law school performance for
black, Chicano, and white law students. Report #LSAC-77-3. In
Law School Admission Council, Reports of LSAC Sponsored
Research: Volume III, 1975-1977. Princeton, N.J.: Law School
Admission Council, 1977.

Powers, D. E., Swinton, S. S., & Carlson, A. B. A factor analytic study
of the GRE Aptitude Test (GRE Board Professional Report
75-11P). Princeton, N.J.: Educational Testing Service, August 1977.

Rock, D. A. The prediction of doctorate attainment in psychology,
mathematics, and chemistry (GRE Board Preliminary Report).
Princeton, N.J.: Educational Testing Service, August 1972 (ERIC
Document Reproduction Service No. ED 069 664). Later
published as GRE Board Research Report 69-6aR, June 1974.

Stanley, J. C. Predicting college success of the educationally
disadvantaged. Science, 1971, 171, 640-647.

Swineford, F. Comparisons of black candidates and Chicano
candidates with white candidates (LSAC Report-72-6). In Law School
Admission Council, Reports of LSAC Sponsored Research:
Volume II, 1970-1974. Princeton, N.J.: Law School Admission
Council, 1976, pp. 261-263.

Wild, C. L. Statistical issues raised by Title IX requirements on
admission procedures. Journal of the National Association for
Women Deans, Administrators, and Counselors, 1977, 40, 53-56.

Willingham, W. W. Predicting success in graduate education.
Science, 1974, 183, 273-278.

Willingham, W. W. Validity and the Graduate Record Examinations.
Princeton, N.J.: Educational Testing Service, 1976.

Wilson, K. M. GRE cooperative validity studies project: Extended
progress report (75-8). Unpublished report to the GRE Board
Research Committee, 1977.



APPENDIX I 



Four Types of Questions Studied but Not Selected for Use in the Analytical Ability Measure
of the GRE Aptitude Test



Letter Sets 

ABCDEFGHIJKLMNOPQRSTUVWXYZ

Directions: Each problem consists of five groups of letters. You
are to find, for each group of letters, a pattern that depends only
on the relative order in the alphabet of the letters in that group.
Then choose the one group whose letters do not show the same
pattern as that shown by the letters in the other groups.

1. (A) ABCD (B) DEFG (C) JKLN
   (D) VWXY (E) PQRS

The pattern in groups (A), (B), (D), and (E) can be summarized as
follows: The letters in the group are in consecutive alphabetical
order. The letters in group (C) do not show this pattern.
Therefore, the correct answer is (C).

Each group of letters should be considered independently. Look
at relationships within groups instead of relationships between
groups. Do not concern yourself with whether the letters fall near or
at the beginning or end of the alphabet, or with whether the letters
are vowels or consonants. Do not consider differences in the
sounds represented by the letters, or the relationship of the letters
to any other groups of letters, such as words. Consider only the
relative order in the alphabet of the letters in a group.

2. (A) CEGI (B) EGIK (C) GIKM
   (D) IKMP (E) PRTV

The pattern in groups (A), (B), (C), and (E) can be summarized as
follows: Each of the four letters is separated from the next by one
successive letter. The letters in group (D) do not show this pattern,
so (D) is the correct answer. (Note: It would be wrong to choose
answer (E) on the basis of the statement "All of the other groups
contain the letter I." This statement has nothing to do with the
relative order of letters within each of several independent groups.)
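
The rule behind the two sample items can be sketched as a short Python check. This is only an illustration, not part of the test materials: it assumes the pattern in each group can be summarized by the alphabetical gaps between successive letters (true of both samples above, though not necessarily of every Letter Sets item), it reads group (D) of item 2 as IKMP, and the function names are hypothetical.

```python
from collections import Counter


def gap_pattern(group):
    """Alphabet-position differences between successive letters of a group."""
    letters = group.upper()
    return tuple(ord(b) - ord(a) for a, b in zip(letters, letters[1:]))


def odd_group(groups):
    """Return the label (A-E) of the one group whose gap pattern is unique.

    ASSUMPTION: exactly one group differs; otherwise a ValueError is raised.
    """
    patterns = [gap_pattern(g) for g in groups]
    counts = Counter(patterns)
    odd = [i for i, p in enumerate(patterns) if counts[p] == 1]
    if len(odd) != 1:
        raise ValueError("expected exactly one group with a unique pattern")
    return "ABCDE"[odd[0]]


# Sample item 1: four groups are consecutive letters; JKLN is not.
print(odd_group(["ABCD", "DEFG", "JKLN", "VWXY", "PQRS"]))
# Sample item 2: four groups skip one letter each time; IKMP does not.
print(odd_group(["CEGI", "EGIK", "GIKM", "IKMP", "PRTV"]))
```

Because only the gaps within a group are compared, the check ignores where in the alphabet a group falls and which letters it happens to contain, which matches the caution given in the directions.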

Logical Reasoning

Directions: The questions in this section require you to evaluate
the reasoning contained in brief statements or passages. In some
questions, each of the choices is a conceivable solution to the
particular problem posed. However, you are to select the one that
is best, that is, the one that does not require you to make what are
by common-sense standards implausible, superfluous, or
incompatible assumptions. After you have chosen the best answer,
blacken the corresponding space on the answer sheet.

1. Since all rabbits that I have seen have short tails, all rabbits
probably have short tails.

Which of the following most closely parallels the kind of
reasoning used in the sentence above?

(A) Since all chemical reactions that I have seen have been
    undramatic, probably only minor changes took place in
    the substances involved.
(B) Since all the human social systems that I have heard of
    have sexual taboos, all of these sexual taboos have
    probably had survival value for the human race.
(C) Since all of the plays of Jovita Maldonado that I have
    seen feature a spurned lover, probably all of her plays
    feature this character type.
(D) Since all eating utensils that I have seen are made of
    metal, metal is probably the most desirable material
    for eating utensils.
(E) Since sight is the most important of man's five major
    senses, its failure probably seriously affects an
    individual's aptitude for all formal education.

The statement on which this quite easy question is based reflects
inductive reasoning: generalizing about an entire class on the basis
of specific observations. Although one could criticize the
conclusion by pointing to the limitations of the observations, this
question does not ask for an evaluation of the reasoning process but
for recognition of a parallel example of that kind of reasoning.

All of the answer choices are similar in some ways, but only one is
a statement about specific observations followed by a
generalization based on those observations. In (A), (B), and (D), the
second part of the statement is not a generalization based on the
observations mentioned in the first part but is an explanation or
suggested reason for what was observed. In (E) an assumption is
followed by a conclusion. Only (C) refers to specific observations
(about some of Maldonado's plays) and proceeds to generalize (about
all of Maldonado's plays) on the basis of these observations, however
limited they may be. Therefore (C) is the correct answer.

2. A good hotel can give you a beautiful room for $30 a day, with
three meals, and make a profit and pay taxes. And yet a
tax-exempt hospital operates in the red for $65 a day. I say it must
be bad administration.

The author's argument would be considerably weakened if
attention were drawn to the fact that

(A) hotel managers receive better training than do hospital
    administrators
(B) the quality of food served by hotels exceeds that of food
    served in hospitals
(C) hospitals are run by dishonest administrators
(D) hospitals provide other services besides room and
    board
(E) hospital deficits are a recent phenomenon

This very easy question focuses on the reasonableness of drawing a
conclusion from the evidence presented. The author's contention is
based on some evidence: the discrepancy between a hotel's and a
hospital's operating expenses. The question asks you to identify
additional evidence that would weaken the argument that bad
administration is responsible for the discrepancy in expenses. (A)
and (B) cannot be that evidence because these choices, if true,
would actually strengthen the author's argument. (C) is a slightly
altered version of the author's own statement of a reason for the
discrepancy. (E) could weaken the argument, but only if more
information were given. Only (D) provides evidence that casts doubt
on the argument. If hospitals provide services other than those
mentioned, then the costs of those services rather than bad
administration are likely to be the reason for the difference between
hotel and hospital expenses. Therefore (D) is the correct answer.

Questions 3-4 refer to the following passage.

A servant who was roasting a stork for his master was prevailed
upon by his sweetheart to cut off one of its legs for her to eat.
When the bird was brought to the table, the master asked what
had become of the other leg. The man answered that storks never
had more than one leg. The master, very angry but determined to
render his servant speechless before he punished him, took the
servant the next day to the fields where they saw storks, each
standing on one leg. The servant turned triumphantly to the
master; but the master shouted, and the birds put down their
other legs and flew away. "Ah, sir," said the servant, "you did not
shout to the stork at dinner yesterday; if you had, he too would
have shown his other leg."

3. The servant's final retort to his master would be true if which
two of the following statements were simultaneously true?

  I. Roasted storks at the dinner table behave just as live
     storks in the field do.
 II. The missing leg on yesterday's roasted stork had
     actually been tucked under the bird.
III. The master had not undertaken to teach the servant a
     lesson.
 IV. The servant's sweetheart, rather than the servant
     himself, had cut off the stork's leg.

(A) I and II (B) I and III (C) II and III
(D) II and IV (E) III and IV

The humor of the fable on which these questions are based derives
from a logical problem relating to the drawing of conclusions from
evidence. The first question is based on the servant's clever retort
to his master. The servant's retort is a logical conclusion if certain
assumptions are made. The servant's argument (that if the master
had shouted at the roasted stork on yesterday's table, it would have
shown its other leg as did the storks in the field) assumes that
roasted and live storks behave in the same way (I) and also assumes
what the servant would like the master to believe, that the missing
leg on yesterday's roast had been tucked under the bird (II) rather
than eaten by the servant's sweetheart. If (III) were true, the
situation leading to the servant's retort would not have occurred,
but (III) does not bear on the truth of the servant's retort. (IV) is
not the answer because it does not matter who cut off the stork's
leg; what is important to the situation is that it was cut off. Thus
(A), (I and II), is the correct answer. This question is of about
average difficulty.

4. The servant was able to attack the master's demonstration
primarily because the master failed to

(A) objectively consider the possibility that storks might
    have only one leg each
(B) take anyone with him to the fields to confirm his
    observations
(C) plan later experiments to follow up and validate his
    tentative findings
(D) reveal to his servant that it was possible for storks to
    lack one leg and still fly
(E) consider that conditions governing the demonstration
    were unlike those of the previous day's happening

This question, a moderately easy one, asks for a criticism of the
master's demonstration. The master did prove what he had
intended to prove, namely that storks have more than one leg.
However, the servant was able to elude the master's "proof" of his
guilt by immediately accepting the two-leggedness of storks
(including that of a roasted stork) but claiming that the master had
simply not treated the roasted and live storks in the same way. Thus
(E) is correct because the master's experiment took place under
different conditions than did the previous day's experience. (D) is
irrelevant to the argument and can be easily eliminated. Choices (A),
(B), and (C) all sound as if they might be flaws in an experiment
intended to be scientific. But they do not explain why the servant
was able to spring back with a new argument. Objective consideration
of the possibility that storks might have only one leg each, (A),
would not have strengthened the master's demonstration. More
observers, (B), or additional experiments, (C), to confirm his
findings would perhaps have strengthened his point about live storks
but would have had no implications for roasted ones. Thus only (E)
can be the answer.

Questions 5-6 refer to missing portions of the following passage.
For each question, choose the completion that is best according
to the context of the passage.

If a book disgusted everyone, no one would read it. However,
one can be sure of selling many copies of a book that is publicly
proclaimed obscene, for the officially held standards of propriety
do not prevail throughout the community. At this point I may be
expected to denounce the hypocrisy of the age. I shall not do so.
The concept of hypocrisy applies to morals: a person should be
good and not merely seem so, and a bad person is little mended
by pretense of goodness. But propriety is altogether a matter of
how actions appear, so that ___5___. If a person seems to give no
offense, he gives no offense. Why, then, should a society not have
public standards of propriety different from those applied by each
citizen to his own private conduct? It would be no more absurd to
advertise filthy movies by decorous posters than it is to advertise
decorous movies by filthy posters; and if a society in which
everyone avidly read pornography were to forbid its public sale,
that would mean only that it combined a taste for such reading
with a taste for ___6___.

5. (A) bad men rarely succeed in appearing good
   (B) the concept of hypocrisy does not apply
   (C) actions appear different to observers with different
       standards
   (D) the issue is basically a moral one
   (E) the concepts of impropriety and immorality are
       indistinguishable

Both this question and the next require that you follow the author's
reasoning well enough to fill in missing material. The first question
focuses on the distinction that the author makes between morality
and propriety. The author suggests that propriety, unlike morality,
is entirely a matter of appearance and has nothing to do with what
is really good or bad. Because the author contrasts morality and
propriety, (D) and (E) can be eliminated. (D) assumes that propriety
is a moral issue; (E) suggests that impropriety and immorality (and
by inference propriety and morality) are indistinguishable. Since
"bad" and "good" in (A) refer to morality, (A) does not follow from
the author's ideas on propriety. (C) is an appealing answer, since it
focuses on the way actions appear in public. But (C) does not follow
from the words "so that," leading from the first clause ("propriety is
altogether a matter of how actions appear") to the second clause.
Only (B) is an acceptable answer. It follows from the author's
distinction between propriety and morality because it states that
hypocrisy (associated with the concept of morality) does not apply to
the question of propriety. This question is difficult.

6. (A) obscenity (B) hypocrisy (C) oppression
   (D) good art (E) decorum

The answer to this very difficult question must meet two
requirements. It must describe something that is consistent, in the
context of the author's discussion, with the reading of pornography.
It must also explain the paradox, described in the preceding lines,
of a society in which pornography is read although public sale of
pornography is forbidden. The sense of the author's argument is
that it is possible not to call such a society hypocritical (lines 4-9).
Therefore (B) is a poor choice. (A), "obscenity," is a term similar to
"pornography" and does not suggest any contrast that could
explain the paradox. "Oppression" and "good art," though relevant
in general to the topic of pornography, are not relevant to the
author's discussion here. Thus (C) and (D) can be eliminated. (E),
"decorum," a synonym for "propriety," does fit into the context. It
explains the contrast between privately reading pornography and
publicly banning it according to the author's view that propriety
has to do with public appearance only and not with private actions.
Thus (E) is the correct answer.



Sample Set

Between 10:00 p.m. and 11:00 p.m. on October 31, five children
were admitted to the Fairchild General Hospital. Four were
suffering from severe stomach cramps and vomiting. All five had been
out "trick or treating" and had eaten a good deal of candy. Doctors
at the hospital diagnosed cyanide poisoning and called the
police. The police ascertained that only one street, Mavis
Avenue, had been canvassed by all five of the children, three in
one group and two in another. Their bags of candy were impounded.

When the residents of Mavis Avenue were interviewed, several
mentioned that John Ames, their neighbor, had said, "I'm going to
give some kids a Halloween they won't forget." Records at the
corner pharmacy indicated that Ames had purchased cyanide on
October 29.

Conclusion: Ames poisoned the children.

Sample 1: Some of the candy remaining in the children's bags
          contained cyanide. Ames's fingerprints were found on the
          wrappers of the poisoned candy.

Sample 2: All five children remembered going to the Ames house.

Sample 3: Ames had given pennies, not candy, to the children.

Sample 4: Three of the five hospitalized children did not think
          they had gone to the Ames house.

Sample 5: Several of Ames's friends said that they would vouch
          for his character.



Sample Answers

Sample 1: A
Sample 2: B
Sample 3: C
Sample 4: D
Sample 5: E



Evaluation of Evidence

Directions: Each of the sets in this section consists of a
description of a fact situation and a conclusion based on that
situation. Following each conclusion are a number of statements.

Consider each statement separately in relation to the fact
situation. Then on the answer sheet blacken space

A if the statement either proves the conclusion or makes it
almost certainly true;

B if the statement supports the conclusion but does not make it
almost certainly true;

C if the statement either disproves the conclusion or makes it
almost certainly false;

D if the statement weakens the conclusion but does not make it
almost certainly false;

E if the statement is irrelevant to the conclusion or affects it only
slightly.
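
The five answer spaces amount to a small decision rule: which way a statement pushes the conclusion, and whether it settles the matter. The sketch below is illustrative only. The names `Effect` and `answer_space` are hypothetical, and the judgments attached to the five sample statements are one reading of the Ames example, not wording taken from the manual.

```python
from enum import Enum


class Effect(Enum):
    """Direction in which a statement bears on the conclusion (assumed encoding)."""
    SUPPORTS = "supports"
    WEAKENS = "weakens"
    IRRELEVANT = "irrelevant"


def answer_space(effect: Effect, nearly_certain: bool = False) -> str:
    """Return the answer space (A-E) that the directions assign."""
    if effect is Effect.IRRELEVANT:
        return "E"  # irrelevant, or affects the conclusion only slightly
    if effect is Effect.SUPPORTS:
        return "A" if nearly_certain else "B"
    return "C" if nearly_certain else "D"


# Illustrative judgments for the five sample statements (an assumption,
# chosen to mirror the sample set about Ames).
sample_judgments = [
    (Effect.SUPPORTS, True),     # fingerprints on the poisoned wrappers
    (Effect.SUPPORTS, False),    # all five recall going to the Ames house
    (Effect.WEAKENS, True),      # Ames gave pennies, not candy
    (Effect.WEAKENS, False),     # three children doubt they went there
    (Effect.IRRELEVANT, False),  # friends vouch for his character
]
sample_answers = [answer_space(e, c) for e, c in sample_judgments]
```

The hard part of a real question is the judgment itself; the function only records how a judgment, once made, maps onto the answer sheet.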



Deductive Reasoning 

Directions: The questions in this section are based on diagrams
consisting of a rectangle divided into 4 regions.

[Diagram: a rectangle divided into four regions numbered 1, 2, 3, 4]

A plus sign (+) in a region represents the statement that there is
something in the region. A minus sign (-) in a region represents
the statement that there is nothing in the region.



[Diagram: plus sign in region 1] There is something in the first
region.

[Diagram: minus sign in region 1, plus sign in region 4] There is
nothing in the first region, and there is something in the fourth
region.






If a bracket connects two signs, then one of the signs holds
and the opposite of the other sign holds.

If two diagrams are given, the information in them may be
combined in a single diagram.



[Diagram: two signs connected by a bracket] is equivalent to either
[diagram] or [diagram].

Given [diagram] and [diagram], then [diagram] must result.



If two plus signs or two minus signs are connected by a bracket,
then one plus sign and one minus sign must result. If a plus sign
and a minus sign are connected by a bracket, then two plus signs
or two minus signs must result.



If an arc connects two signs, then AT LEAST ONE of the
signs holds, and BOTH signs may hold.

[Diagram: two signs connected by an arc] is equivalent to either
[diagram] or [diagram] or [diagram].

Sample Question

Given [diagram] and [diagram]

Which of the following can result?

(A) [diagram]   (B) [diagram]   (C) [diagram]   (D) [diagram]

D is the correct answer.
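
The bracket and arc rules can be checked mechanically by listing every way of filling the four regions and keeping the fillings that satisfy the given signs. The sketch below is a minimal illustration under stated assumptions: a sign is written as a hypothetical `(region, symbol)` pair, a bracket is read as "exactly one of the two signs holds," and an arc as "at least one holds." The sign placements in the examples are invented, since the original diagrams did not survive reproduction.

```python
from itertools import product


def holds(sign, state):
    """True if a (region, symbol) sign is satisfied by a filling of the regions.

    state[i] is True when there is something in region i + 1.
    """
    region, symbol = sign
    return state[region - 1] == (symbol == "+")


def consistent_states(plain=(), brackets=(), arcs=()):
    """List every filling of the 4 regions that satisfies all given signs."""
    result = []
    for state in product([True, False], repeat=4):
        if not all(holds(s, state) for s in plain):
            continue
        # Bracket: one sign holds and the opposite of the other holds.
        if not all(holds(a, state) != holds(b, state) for a, b in brackets):
            continue
        # Arc: at least one of the two signs holds; both may hold.
        if not all(holds(a, state) or holds(b, state) for a, b in arcs):
            continue
        result.append(state)
    return result


# Two plus signs joined by a bracket: one plus and one minus must result.
same = {s[:2] for s in consistent_states(brackets=[((1, "+"), (2, "+"))])}

# A plus and a minus joined by a bracket: two pluses or two minuses result.
mixed = {s[:2] for s in consistent_states(brackets=[((1, "+"), (2, "-"))])}

# Two plus signs joined by an arc: three possibilities remain.
arc = {s[:2] for s in consistent_states(arcs=[((1, "+"), (2, "+"))])}
```

Combining two diagrams corresponds to passing the signs of both to one call; an answer choice "can result" if at least one surviving filling agrees with it.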






APPENDIX II
Information Unique to Each Advanced Test 



On the following pages, the content of each of the 20 Advanced
Tests is described. In addition, for all the tests except Advanced
Computer Science, the percentage distributions of students'
responses to questions about their field backgrounds and
educational goals are reported. These data are based on the responses of
students taking the Advanced Tests at all administrations during
the 1970-71 academic year. The percentage not responding at all to
each question is not given.

Validity data are also summarized where available. Only studies
using an Advanced Test as one of the predictors and involving
students at least some of whom entered graduate school as late as
1956 are included. Three persistent problems make it difficult to
interpret validity coefficients: 1) small samples of students, 2)
inadequate criteria of success in graduate school, and 3) restriction of
the range of measures for both predictors and criteria. (A full
discussion of validity is presented in Chapter 6.) For each validity
study, the student group involved is described in terms of size,
institution, year, and other pertinent characteristics, as available. The
predictors and criteria and the relations between these variables
are reported.
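
To give a sense of how much the third problem, restriction of range, can depress a coefficient, the sketch below applies the standard correction for direct restriction on the predictor (often called Thorndike's Case 2). It is offered only as an illustration: the manual does not state that this correction was applied to any study summarized here, and the function name and the numbers in the usage note are hypothetical.

```python
import math


def correct_for_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Estimate the correlation in the unrestricted group.

    Assumes selection acted directly on the predictor and that the
    regression of criterion on predictor is linear and homoscedastic.
    """
    if sd_unrestricted <= 0 or sd_restricted <= 0:
        raise ValueError("standard deviations must be positive")
    k = sd_unrestricted / sd_restricted
    r = r_restricted
    return r * k / math.sqrt(1 - r ** 2 + (r * k) ** 2)
```

For example, a coefficient of .30 observed in a group whose predictor standard deviation is half that of the applicant pool corresponds, under these assumptions, to roughly .53 in the full pool: `correct_for_range_restriction(0.30, 2.0, 1.0)`.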



ADVANCED BIOLOGY TEST 



Content 

To cover the broad field of the biological sciences, the subject
matter is organized into the three major areas of cellular and
subcellular, organismal, and population biology.

About one-third of the examination is concentrated on the
materials and phenomena found at the subcellular and cellular levels
of organization. "Subcellular" is defined operationally to include
atomic and molecular species, macromolecules, and such
structures as cell organelles and viruses. "Cellular," also defined
operationally, includes unicellular organisms and the distinctive cell
types of multicellular organisms. Under this general heading,
consideration is given to the chemistry and physics of the atoms
and molecules found in biological systems as well as to their
functions and architectural involvement. The energetics of subcellular
and cellular levels is included, emphasizing photosynthesis,
synthetic and degradative pathways, and maintenance needs.
Homeostatic mechanisms are examined from the metabolic and
stimulus-response aspects. Replicative processes and the means of
transmission of information for them are considered, and some
attention is given to techniques of study.

The organismal biology questions are concerned with the biology
of multicellular organisms as individuals. The questions relate to
the genetic and environmental control of growth and development,
structure and function, and behavior. Development includes those
processes from fertilization through organogenesis to
postembryonic development, metamorphosis, senescence, regeneration,
life cycles, and transmission of hereditary characters. Structural
and functional aspects include homeostatic mechanisms at the
tissue and organ levels and hormonal and neural integration.
Behavior encompasses reflex mechanisms, spontaneous activity,
innate and motivated behavior, biorhythms, maturational changes,
and various forms of learning.

The population biology questions deal with populations and their
responses to environmental factors and genetic change. Included
are the systematics of organisms and ecosystem structure and
function, with consideration of such topics as energy flow, material
cycling, community homeostasis, and the ecological impact of
human activity. Other aspects of population biology considered are
population genetics, population behavior, the evolutionary
sequence of organisms, and the mechanisms by which evolution has
occurred.

Since certain abilities are judged important in undergraduate
biology curriculums, consideration is given to evaluating the
student's

• understanding of (a) the historical development of basic
biological concepts and (b) scientific processes and methods of
investigation, including recognition of the tentative character of
much scientific knowledge;

• ability to apply the techniques and methods of biological science
to the interpretation of laboratory and field situations and basic
research findings;

• ability to use resource material, evaluate unfamiliar material, and
establish relationships between the contributions of biological
science and those of other disciplines.

Responses to Background Questions, 1970-71
(N = 13,496)

A. At what point are you in your studies?

         (1) I am in or have just completed my junior year of
             undergraduate study.
   67%   (2) I am an undergraduate senior.
   20%   (3) I have a bachelor's degree but am not presently enrolled
             in graduate school.
  [?]%   (4) I am in or have just completed my first year of graduate
             study.
  [?]%   (5) I am in or have completed my second year of graduate
             study.

B. What graduate degree do you intend to seek?

    4%   (1) I do not plan to pursue graduate study.
    3%   (2) I plan to pursue graduate work but not to obtain a
             graduate degree.
   33%   (3) I plan to obtain terminal M.A., M.S., or other degree at the
             master's level.
   31%   (4) I plan to obtain M.A., M.S., or other master's level degree
             leading to a doctoral degree.
 2[?]%   (5) I plan to obtain Ph.D., Ed.D., or other degree at the
             doctoral level.

C. How many semester (quarter) courses of chemistry did you
take as an undergraduate?

    1%   (1) None
   11%   (2) Two or fewer
   44%   (3) Three or four
   30%   (4) Five or six
   14%   (5) Seven or more

D. How many semester (quarter) courses of physics did you take
as an undergraduate?

 1[?]%   (1) None
  [?]%   (2) One
 4[?]%   (3) Two
   12%   (4) Three
  [?]%   (5) Four or more

E. How many semester (quarter) courses of mathematics did
you take as an undergraduate?

    5%   (1) None
         (2) One
         (3) Two
 2[?]%   (4) Three
   19%   (5) Four or more

F. If you were an undergraduate biology major, in which of the
following areas did you specialize?

 3[?]%   (1) General Biology
   25%   (2) Zoology
    5%   (3) Botany
  [?]%   (4) Microbiology
 1[?]%   (5) Other



Validity Data

1. The subjects for a study by Creager (1965) were 460 applicants
(320 males and 140 females) for National Science Foundation
fellowships in 1955 and 1956. The predictors were scores on the
GRE Aptitude Test (verbal and quantitative) and the Advanced
Biology Test. One criterion was time lapse between attainment of
the B.A. and the Ph.D., coded as shown below:

B.A.-Ph.D. Time      Less                               No Ph.D. by
Lapse (in years):    than 4   4   5   6   7   8   9     Aug/64
Coded Variable:         1     2   3   4   5   6   7        8

A second criterion was the dichotomous variable of attaining or not
attaining a Ph.D. by August 1964. The third criterion was the
dichotomous variable of attaining or not attaining a Ph.D. in the
average time taken to attain a Ph.D. in the field. The relationships
between predictors and criteria are shown in Tables 1 and 2.
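
The time-lapse coding can be written out as a short function, which also makes plain why the correlations on this criterion are "reflected": a higher code means a slower (or no) doctorate, so the sign is reversed to make a positive coefficient mean that higher scores go with faster attainment. This is a minimal sketch; the function names and the use of `None` for "no Ph.D. by August 1964" are assumptions made for illustration.

```python
from typing import Optional


def code_time_lapse(years: Optional[int]) -> int:
    """Code a B.A.-to-Ph.D. time lapse (whole years) on the 1-8 scale.

    `None` stands for "no Ph.D. by August 1964" (an encoding assumed here).
    """
    if years is None:
        return 8
    if years < 0 or years > 9:
        raise ValueError("the coding table only covers lapses of 0 to 9 years")
    if years < 4:
        return 1
    return years - 2  # 4 -> 2, 5 -> 3, ..., 9 -> 7


def reflect(r: float) -> float:
    """Reverse the sign of a correlation computed on the coded variable."""
    return -r
```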

Table 1: Validities of GRE Scores against Doctorate
Attainment for 320 Males Who Were Applicants for
National Science Foundation Fellowships in Biology
in 1955 and 1956

                                               Criteria
                            Reflected      Ph.D. by 1964        Ph.D. in Average Time
                            B.A.-Ph.D.     Point                Point
Predictors                  Time Lapse¹    Biserial  Biserial   Biserial  Biserial

GRE Verbal Ability            .2[?]          .20       .26        .20       .26
GRE Quantitative Ability      .23            .21       .27        .21       .27
GRE Advanced Biology          .1[?]          [?]       .18        .17       .22
Composite                     .26            .23       .29        .23       .3[?]



Table 2: Validities of GRE Scores against Doctorate
Attainment for 140 Females Who Were Applicants for
National Science Foundation Fellowships in Biology
in 1955 and 1956

                                               Criteria
                            Reflected      Ph.D. by 1964        Ph.D. in Average Time
                            B.A.-Ph.D.     Point                Point
Predictors                  Time Lapse¹    Biserial  Biserial   Biserial  Biserial

GRE Verbal Ability            .14            .06       .09        [?]       .22
GRE Quantitative Ability      .20            .11       .17        .22       .35
GRE Advanced Biology          .23            .17       .26        .23       .37
Composite                     .2[?]          .13       .29        .26       [?]

¹Correlations between the coded variable for B.A.-Ph.D. time lapse given above and the
predictors, with the signs reversed.
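
The tables report both point-biserial and biserial coefficients for the dichotomous criteria. As a reminder of how the two relate, the sketch below computes a point-biserial correlation from raw data and converts it to a biserial coefficient using the textbook relation, which assumes a normally distributed variable underlies the dichotomy. The function names and the sample data are hypothetical; the manual does not give the computing formulas used in the original study.

```python
import math
from statistics import NormalDist, mean, pstdev


def point_biserial(scores, outcomes):
    """Pearson correlation between a continuous score and a 0/1 outcome."""
    group1 = [s for s, o in zip(scores, outcomes) if o == 1]
    group0 = [s for s, o in zip(scores, outcomes) if o == 0]
    p = len(group1) / len(scores)  # proportion with outcome 1
    q = 1 - p
    return (mean(group1) - mean(group0)) / pstdev(scores) * math.sqrt(p * q)


def biserial_from_point_biserial(r_pb, p):
    """Convert a point-biserial r to a biserial r.

    p is the proportion in one outcome group; y is the height of the
    standard normal curve at the point that cuts off that proportion.
    """
    y = NormalDist().pdf(NormalDist().inv_cdf(p))
    return r_pb * math.sqrt(p * (1 - p)) / y


# Hypothetical data: four scores, the two highest attain the doctorate.
r_pb = point_biserial([1, 2, 3, 4], [0, 0, 1, 1])
r_b = biserial_from_point_biserial(0.20, 0.5)
```

Because the conversion factor is always greater than 1, a biserial coefficient is larger in absolute value than the point-biserial computed from the same data, which is the pattern the tabled values show.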



2. Roberts (1970) studied the records of [?] students who had
enrolled at Wake Forest University from June 1964 to June 1970 for
graduate study in biology and who had completed at least nine
hours of graduate work. The correlations between graduate
grade-point averages and GRE scores were .24 for verbal ability, .27 for
quantitative ability, and .36 for Advanced Biology.






ADVANCED CHEMISTRY TEST 



Content 

The content of the examination emphasizes the four fields into
which chemistry has been traditionally divided and their
interrelationships. An outline of the material covered by the test follows:

I. ANALYTICAL CHEMISTRY 15 percent

A. Classical quantitative area

Titrimetry; separations, including theory and applications
of chromatography as well as gravimetry; data handling,
including statistical tests (t, F, Q, chi-square); standards
and standardization techniques

B. Instrumentation area

Basic electronics; electrochemical methods; spectroscopic
methods, including mass spectroscopy and those
in the electromagnetic spectrum from high-energy nuclear
processes of radioactivity to nuclear magnetic resonance

II. INORGANIC CHEMISTRY 25 percent

A. Atomic theory

Elementary particles, atomic structure, classical
experiments

B. The nucleus

Binding energy, abundance and stability of nuclei,
isotopes

C. Extranuclear structures and related properties

Electronic distributions in atoms, periodic classifications,
properties dependent on extranuclear structure

D. Chemistry of the families of elements

Preparations, reactions, properties, and important
applications of the elements and their compounds stressing
family relationships and dependence on extranuclear
structure; families of representative elements, families of
transition elements, lanthanides and actinides

III. ORGANIC CHEMISTRY 30 percent

A. Principal reactions of simple functional groups

Hydrocarbons, alcohols, alkyl and aryl halides,
organometallic compounds, carbonyl compounds, conjugate
unsaturated carbonyl compounds, amines, diazonium
compounds, acids, phenols, simple sulfur-containing
compounds

B. Structure and mechanism

Electronic structures, isomers and stereochemistry,
theoretical concepts, basic reaction mechanisms,
structural interpretation of spectral (ultraviolet, infrared,
nuclear magnetic resonance) data

C. More advanced topics and special topics

Laboratory topics, classical reaction types, classical
rearrangements, differentiations by chemical tests, special
reagents, bifunctional compounds, polymerizations,
natural products, comparisons of reactivity, biochemically
related topics

IV. PHYSICAL CHEMISTRY 30 percent

A. Classical and statistical thermodynamics

Equations of state; first, second, and third laws; E(U), H, S,
G, A, Cp, Cv; phase equilibria; equilibrium conditions;
Nernst's equation; elementary statistical mechanics

B. Quantum chemistry and spectroscopy

Energy levels and wave functions for atomic and
molecular electrons, harmonic oscillators, rigid rotors,
and translational motion; selection rules; microwave,
infrared, visible, Raman, and nuclear magnetic resonance
spectroscopy

C. Kinetics and other topics

Elementary kinetic theory of gases; rate laws and
mechanisms; crystallography; dielectric properties;
electrochemistry; surface chemistry; polymers; chemistry of
solutions; applications to biological systems

Each form of the examination samples widely among the topics
listed above, but questions on all the topics are not in every
examination.
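
The percentage weights in the outline translate into approximate question counts once a test length is chosen. The sketch below is illustrative only: the manual does not state the number of questions here, so the length passed in is hypothetical, and the largest-remainder rounding is simply one reasonable way to make the counts add up, not a procedure attributed to the test committee.

```python
def allocate_items(weights, total_items):
    """Split total_items across content areas in proportion to percent weights.

    Uses largest-remainder rounding so the counts sum to total_items.
    """
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100 percent")
    exact = {area: w * total_items / 100 for area, w in weights.items()}
    counts = {area: int(share) for area, share in exact.items()}
    leftover = total_items - sum(counts.values())
    # Hand the remaining items to the areas with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda a: exact[a] - counts[a], reverse=True)
    for area in by_remainder[:leftover]:
        counts[area] += 1
    return counts


# Weights taken from the outline above; the test length is hypothetical.
chemistry_weights = {
    "analytical": 15,
    "inorganic": 25,
    "organic": 30,
    "physical": 30,
}
counts_120 = allocate_items(chemistry_weights, 120)
```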

Responses to Background Questions, 1970-71
(N = 5,126)

A. At what point are you in your studies?

    3%   (1) I am in or have just completed my junior year of
             undergraduate study.
   65%   (2) I am an undergraduate senior.
   17%   (3) I have a bachelor's degree but am not presently enrolled
             in graduate school.
  [?]%   (4) I am in or have just completed my first year of graduate
             study.
    7%   (5) I am in or have completed my second year of graduate
             study.

B. What graduate degree do you intend to seek?

         (1) I do not plan to pursue graduate study.
    2%   (2) I plan to pursue graduate work but not to obtain a
             graduate degree.
   19%   (3) I plan to obtain terminal M.A., M.S., or other degree at the
             master's level.
   24%   (4) I plan to obtain M.A., M.S., or other master's level degree
             leading to a doctoral degree.
 [?]0%   (5) I plan to obtain Ph.D., Ed.D., or other degree at the
             doctoral level.



Validity Data

1. A group studied by Rock (1972) included 643 applicants for
National Science Foundation fellowships. Most applied for first-year
NSF fellowships in 1958-61. The predictors were scores on the
GRE Advanced Chemistry Test and GRE Aptitude Test (verbal and
quantitative), undergraduate grade-point average, and an average
rating of reference letters. The criterion was attainment of the
doctorate by June 1968. The group was split into random halves; the
validity coefficients for each half are shown in Table 3.

Table 3: Validities Using the Criterion of Attainment of
the Doctorate for 643 National Science Foundation
Fellowship Applicants in Chemistry in 1958-61,
Split into Two Random Halves

                                         n = 322                          n = 321
                             r-biserial      Predictor        r-biserial      Predictor
                             Correlation     Performance      Correlation     Performance
                             with                  Standard   with                  Standard
Predictors                   Criterion     Mean    Deviation  Criterion     Mean    Deviation

GRE Advanced Chemistry¹        [?]        67.41     [?]        0.4[?]      66.27     [?]
GRE Verbal Ability¹            [?]         [?]      [?]        0.23       5[?].40   10.75
GRE Quantitative Ability¹      [?]       69.2[?]   10.70       0.3[?]      67.96    10.7[?]
Undergraduate grade-point
  average²                     [?]         [?]      [?]        0.36       247.93     [?]
Reference letter rating³       [?]         [?]      [?]        0.33        [?]       [?]

¹Scaled score with the third digit dropped.
²On a four-point scale multiplied by 100.
³Zero to [?] multiplied by 100.



2. The subjects for a study by Creager (1965) were 660 applicants
(500 males and 160 females) for National Science Foundation
fellowships in 1955 and 1956. The predictors were scores on the
GRE Aptitude Test (verbal and quantitative) and the Advanced
Chemistry Test. One criterion was time lapse between attainment of
the B.A. and the Ph.D., coded as shown below:

B.A.-Ph.D. Time      Less                               No Ph.D. by
Lapse (in years):    than 4   4   5   6   7   8   9     Aug/64
Coded Variable:         1     2   3   4   5   6   7        8

A second criterion was the dichotomous variable of attaining or not
attaining a Ph.D. by August 1964. The third criterion was the
dichotomous variable of attaining or not attaining a Ph.D. in the
average time taken to attain a Ph.D. in the field. The relationships
between predictors and criteria are shown in Tables 4 and 5.



Table 4: Validities of GRE against Doctorate Attainment
for 500 Males Who Were Applicants for National Science
Foundation Fellowships in Chemistry in 1955 and 1956

                                               Criteria
                            Reflected      Ph.D. by 1964        Ph.D. in Average Time
                            B.A.-Ph.D.     Point                Point
Predictors                  Time Lapse¹    Biserial  Biserial   Biserial  Biserial

GRE Verbal Ability            .16            .12       .1[?]      .[?]3     .16
GRE Quantitative Ability      .26            .21       .26        .22       .2[?]
GRE Advanced Chemistry        .3[?]          .3[?]     .39        .2[?]     .43
Composite                     .39            .3[?]     .41        .35       [?]

¹Correlations between the coded variable for B.A.-Ph.D. time lapse given above and the
predictors, with the signs reversed.



Table 5: Validities of GRE against Doctorate Attainment
for 160 Females Who Were Applicants for National
Science Foundation Fellowships in Chemistry
in 1955 and 1956

                                               Criteria
                            Reflected      Ph.D. by 1964        Ph.D. in Average Time
                            B.A.-Ph.D.     Point                Point
Predictors                  Time Lapse¹    Biserial  Biserial   Biserial  Biserial

GRE Verbal Ability            .15            .[?]7     .25        .17       .25
GRE Quantitative Ability      .29            .27       .39        .27       .39
GRE Advanced Chemistry        .38            .37       [?]        .37       .5[?]
Composite                     .40            .33       .33        .33       .55

¹Correlations between the coded variable for B.A.-Ph.D. time lapse given above and the
predictors, with the signs reversed.



3. Roberts (1970) studied the records of 31 students who had
enrolled at Wake Forest University from June 1964 to June 1970 for
graduate study in chemistry and who had completed at least nine
hours of graduate work. The correlations between graduate
grade-point averages and GRE scores were .55 for verbal ability, -.07 for
quantitative ability, and .11 for Advanced Chemistry.






ADVANCED COMPUTER SCIENCE TEST 



Content

The approximate distribution of questions in each edition of the
test according to content categories is indicated by the following
outline. The items in parentheses are intended to be examples of
topics under the headings, not exhaustive lists. The percentages
given are approximate; actual percentages will vary slightly from
one edition of the test to another. The issue of how to balance
theory versus practical applications is a problematical one and is
not yet completely resolved.

I. Programming Systems and Methodology 40 percent

A. Programming Languages and their Processors

(Evaluation of expressions, block structure, parameter
passing and binding, control structures, assemblers,
compilers, interpreters)

B. Programming Concepts

(Iteration, recursion, modularity, abstraction, refinement,
verification, documentation)

C. Properties of Algorithms

(Time and space requirements of programs, especially of
common processes such as sorting and searching,
correctness of programs)

D. Data Structures

(Linear data structures, list structures, strings, stacks,
queues, trees)

E. Operating Systems

(Scheduling, resource and storage allocation, interrupts,
synchronization, addressing techniques, file structures,
editors, batch/time sharing, networks/communications)

II. Computer Systems 20 percent

A. Logic Design

(Switching algebra, combinatorial and sequential
networks)

B. Implementation of Computer Arithmetic

(Codes, number representation, add/subtract/multiply/divide)

C. Processor Organization

(Instruction sets, registers, data and control flow, storage)

D. System Architecture

(Configurations of and communication among processors,
memories, and I/O devices)

III. Theory of Computation 15 percent

A. Automata Theory

(Sequential machines, transitions, regular expressions,
Turing machines, nondeterministic finite automata)

B. Analysis of Algorithms

(Complexity of specific algorithms, exact/asymptotic/lower
bound analysis, analysis of time/space complexity,
correctness)

C. Formal Languages

(Regular and context-free grammars/languages, simple
properties such as emptiness or ambiguity)

IV. Computational Mathematics 20 percent

A. Discrete Structures

(Logic, sets, relations, functions, Boolean algebra, linear
algebra, graph theory, combinatorics)

B. Numerical Mathematics

(Arithmetic, number representation, numerical algorithms,
error analysis, discrete probability, and elementary
statistics)

V. Special Topics 5 percent

(Simulation and modeling, data management systems,
information retrieval, artificial intelligence)

Since the Advanced Computer Science Test was only recently
introduced, answers to background questions by students and
validity data are not available.






ADVANCED ECONOMICS TEST 



Content 

The committee responsible for the test has felt that the primary
concern in graduate school selection is the student's competence
in the basic skills of economic analysis rather than his or her
knowledge of economic institutions. Thus, the test's content
specifications have increasingly emphasized basic macro- and
microeconomic analyses.

In current editions approximately 60 percent of the test is evenly
divided between macroeconomic and microeconomic analysis. The
remainder consists of questions on broad topics that might be
covered in a variety of upperclass economics courses, including
money and banking, international economics, labor, industrial
organization, public finance, quantitative economics, comparative
economic systems, urban economics, and economic development.
Although an individual question may be couched, for example, in
terms of international trade, the information needed to answer it
often has been studied in several other courses.

An important consideration in planning the content of the test is
the emphasis to be given to subjects covered in such courses as
those described above, that is, courses other than introductory
principles or macro- and microeconomics. Although a good proportion of
undergraduate economics majors study money and banking,
international trade, and public finance, it is recognized that
substantial emphasis upon any such topics would penalize those
who have not taken courses devoted to them. Since the preparation
of students varies, it is expected that each score will be evaluated in
light of the student's record as of the time he or she takes the test.

Although questions in quantitative economics are generally
difficult for many students currently taking the test, some such
questions are included because of the increasing importance of the
subject and its relevance to success in many graduate schools.



Responses to Background Questions, 1970-71
(N = 4,[?]70)

A. At what point are you in your studies?

    3%   (1) I am in or have just completed my junior year of
             undergraduate study.
 [?]1%   (2) I am an undergraduate senior.
   18%   (3) I have a bachelor's degree but am not presently enrolled
             in graduate school.
         (4) I am in or have just completed my first year of graduate
             study.
  [?]%   (5) I am in or have completed my second year of graduate
             study.

B. What graduate degree do you intend to seek?

         (1) I do not plan to pursue graduate study.
         (2) I plan to pursue graduate work but not to obtain a
             graduate degree.
         (3) I plan to obtain terminal M.A., M.S., or other degree at the
             master's level.
   24%   (4) I plan to obtain M.A., M.S., or other master's level degree
             leading to a doctoral degree.
[?][?]%  (5) I plan to obtain Ph.D., Ed.D., or other degree at the
             doctoral level.

C. If you are now a college senior, which of the following best
describes your educational experience and your plans with
respect to the graduate study of economics? (If you are not a
senior, mark 5.)

   34%   (1) I am an undergraduate major in economics and I plan to
             do graduate work in economics.
 [?]7%   (2) I am an undergraduate major in economics and I plan to
             do graduate work in an area related to economics.
  [?]%   (3) I am an undergraduate major in economics, but I plan to
             do graduate work in an area unrelated to economics.
    7%   (4) I am not an undergraduate major in economics, but I plan
             to do graduate work in economics.
 3[?]%   (5) Not a senior, or other

D. If you were not an undergraduate major in economics, in
which of the following areas did you specialize?

   30%   (1) Social Sciences (including Business)
    4%   (2) Engineering
         (3) Biological or Physical Sciences
    7%   (4) Mathematics (including Statistics)
    4%   (5) Humanities

In questions E through I, include in your answer the courses in
which you are currently enrolled only if you have completed more
than half the term.

E. How many semester (quarter) courses beyond the first
course have you had in microeconomic analysis (the study of
individual economic units: markets, firms, consumers,
workers)?

   24%   (1) None
         (2) One
   18%   (3) Two
         (4) Three
    6%   (5) Four or more

F. How many semester (quarter) courses have you had beyond
the first course in macroeconomic analysis (the study of
aggregate economic behavior including monetary theory)?

   23%   (1) None
   37%   (2) One
   21%   (3) Two
    8%   (4) Three
    6%   (5) Four or more

G. How many semester (quarter) courses have you taken in
economic statistics and econometrics?

   37%   (1) None
   36%   (2) One
   16%   (3) Two
    5%   (4) Three
    3%   (5) Four or more






H. How many other semester (quarter) courses in economics
have you had including the introductory course(s)?

  [?]%   (1) None
   10%   (2) One or two
 1[?]%   (3) Three or four
   24%   (4) Five or six
   42%   (5) Seven or more

I. How many semester (quarter) courses in mathematics
(including mathematical probability and statistics) have you
taken?

   35%   (1) Two or fewer
   34%   (2) Three or four
    9%   (3) Five
    9%   (4) Six or seven
         (5) Eight or more



ADVANCED EDUCATION TEST 



Content

Questions are drawn from the courses of study most commonly
offered. Since the emphasis is placed on the relationships among the
content dimensions of education, the particular pattern of courses
students have taken is likely to be less crucial than their ability to
integrate the knowledge and skills they have gained.

The test questions, for the most part, ask the student to solve
problems using basic concepts, knowledge, understanding, and
abilities from those areas from which the substantive content of
education is generally drawn, that is, history, philosophy,
psychology, and sociology. Various concerns in education are
considered. These concerns and the relative weight of each in the
test are (1) educational goals, 15 percent; (2) administration and
supervision of the schools, 15 percent; (3) curriculum development
and organization, 15 percent; (4) teaching-learning, 40 percent; (5)
evaluation and research appraisal, 15 percent. The following
outline provides greater detail.

1 Educatjona? joaJs- including {aj the aims of education ar>d the/r 
proponents and justification, viewed philosophically and in his- 
torical perspective: (b) the clarification and feasibility ol a variety 
of possible goa^s of education, with particular reference to 
Physical, emotional-sociai, and intellectual development; (C) the 
role of education as related to a Pluralisttr; society, community 
goaJs. social problems, and so on 

2. Administration and supervision of the schools, includinQ (a) 
sources of influence and authority, viewed historically and Philo- 
sophically: (b) psychologicat considerations, such as grouDing 
for learning and staff and student morale, (c) the teacher's legal 
rights and responsibHities. sociotogtcat considerations, such as 
community characteristics^ needs. asPirations. and role m 
educational planning. 

3. Curriculum development and organization, including (a) evolu- 
tion of tl;^ cirriculum in the schools and Philosophical dimen- 
sions of curriculum issues; (bj curriculum as related to stages of 
growth and development and learning factors; (c) curriculum as 
related to societal demands on education, and so on. 

4 Teaching-learning, including la) the evalutiOn of theones of 
reachir>g-tearning and their reJatronShip to curriculum rypesand 
teaching styles; logical aspects of teaching, tnciuriing defining. 

74 



explaining, questioning* and evaluating claims; (b^ nature of the 
learner, including intellectual, emotiQnal-sociaL and physical 
development: the teaching-learning process, including kinds ot 
(earning, basic concepts arni principles of learning, guidance of 
learning in the classroom: (o) sociological considerations * such 
as Ihe influence of iicciai class stratification on teaching, styles 
of teaching and patterns of social control, and the teacher's role 
as a member of asocial system, 

5. Evaluation and research appraisal, including (a) the justification
and meaning of research conclusions and the bearing of evidence
on educational decisions; current trends and issues and
their historical perspective; (b) elementary statistical, measurement,
and evaluation concepts and techniques bearing on the
appraisal of methods, individuals, small groups, and the broader
society.

Students are called upon to demonstrate their knowledge of facts,
terminology, theory and concepts, evidence, and professional
sources at the same time they demonstrate their skill in using the
cognitive processes, that is, recall of knowledge, comprehension,
application, analysis, and evaluation. A typical question, for
example, might require the student to make a prediction based on
understanding of the social structure of the ghetto family. Another
example would be a question that considers the justification of a
teacher's particular course of action by the use of an appropriate
generalization or a teaching theory.



Responses to Background Questions, 1970-71 
(N = 24,179) 

A. At what point are you in your studies?

r/o (1) I am in or have just completed my junior year of
    undergraduate study.
2^% (2) I am an undergraduate senior.
30% (3) I have a bachelor's degree but am not presently enrolled
    in graduate school.
2fl% (4) I am in or have just completed my first year of graduate
    study.
14% (5) I am in or have completed my second year of graduate
    study.



B. What graduate degree do you intend to seek?

2% (1) I do not plan to pursue graduate study.
3% (2) I plan to pursue graduate work but not to obtain a
    graduate degree.
69% (3) I plan to obtain terminal M.A., M.S., or other degree at the
    master's level.
13% (4) I plan to obtain M.A., M.S., or other master's level degree
    leading to a doctoral degree.
13% (5) I plan to obtain Ph.D., Ed.D., or other degree at the
    doctoral level.

C. In which of the following areas did you major as an
undergraduate?

69% (1) Education (including elementary, secondary, and any
    related subject area specialization)
4% (2) Natural sciences, mathematics
11% (3) Social sciences
7% (4) Humanities, fine arts
7% (5) Other

D. In which of the following types of institutions did you do most
of your undergraduate work?

26% (1) Four- or five-year college offering primarily liberal arts
22% (2) Four- or five-year college offering primarily teacher
    preparation
37% (3) State university
11% (4) Privately endowed university
y/o (5) Other

E. If you are presently working toward a graduate degree in
education, in which of the following areas are you
concentrating your work?

14% (1) Administration and/or supervision
24% (2) Curriculum and instruction
5% (3) Psychological foundations (including educational
    psychology, human growth and development, mental
    hygiene, etc.)
2% (4) Social foundations (including history, philosophy, and
    sociology of education)
17% (5) Pupil personnel services (including guidance, special
    education, etc.)

F. How many semester (quarter) courses have you completed in
the area of administration and/or supervision?

69% (1) None
7% (2) One
4% (3) Two
3% (4) Three
ffVe (5) Four or more

G. How many semester (quarter) courses have you completed in
the area of curriculum and instruction?

31% (1) None
14% (2) One
11% (3) Two
a% (4) Three
27% (5) Four or more



H. How many semester (quarter) courses have you completed in
the area of psychological foundations?

31% (1) None
19% (2) One
15% (3) Two
ICT'o (4) Three
14% (5) Four or more

I. How many semester (quarter) courses have you completed in
the area of social foundations?

43% (1) None
21% (2) One
ia=/o (3) Two
6% (4) Three
9% (5) Four or more



Validity Data

1. The subjects of a study by Eckhoff (1966) were 185 secondary
education majors and 111 elementary education majors with 30 or
more quarter hours accumulated at Winona State College. The
predictors were scores on the GRE Advanced Education Test and the
Miller Analogies Test and undergraduate grade-point average. The
criterion was overall graduate grade-point average. Stepwise
regression analysis showed that the GRE scores added practically
nothing to the prediction for secondary majors, and the MAT scores
added nothing to the prediction for elementary majors. Elimination
of these predictors then yielded the beta weights and multiple
correlation coefficients between the two predictors and the criterion
shown in Table 6.



Table 6: Beta Weights and Multiple Validity Coefficients
Using Graduate Grades as the Criterion for Education
Majors at Winona State College

Column headings: GRE, MAT, UGPA, Multiple
Secondary (n = 185): .41*, 30, 51
(Remaining entries are not legible in the source.)

*Significant at the .05 level
**Significant at the .01 level



2. The group involved in a study by Roscoe and Houston (1969)
included 231 students who successfully completed the doctoral
program in education at Colorado State College and an additional
21 students who were admitted, completed 30 quarter hours, and
then were dismissed. The predictors included scores on the GRE
Advanced Education Test and the GRE Aptitude Test (verbal and
quantitative). The criteria were grade-point average in doctoral
studies, graduation versus dismissal from the program, a normative
judgment, and an ipsative judgment. To secure the normative
judgments, test scores and other predictor variables on 30
representative students not identified by name were presented to 16
graduate professors. The professors rated the students' prospects as
doctoral students. To get the ipsative judgments, the same 16
professors were given the names of doctoral graduates. The
professors rated the professional promise of 10 students on the list
whom they knew. The results are shown in Table 7.



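Table 7: Validities for 252 Students in Education
at Colorado State College

Column headings (as far as legible): Grade-Point, Normative Judgment
Predictors, with entries in printed order:
GRE Advanced Education: 17, 30
GRE Verbal Ability: (no entries legible)
GRE Quantitative Ability: 2', 2b, 77, 17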



3. In a third study, by Williams, Harlow, and Grab (1970), the
subjects were 84 students admitted to the doctoral program in the
Education Department of the University of North Dakota between
June 1962 and June 1967. The predictors included scores on the
GRE Advanced Education Test, the GRE Aptitude Test, and the
Miller Analogies Test, grade-point average for the bachelor's
degree, and grade-point average for the master's degree. The
criteria were doctoral grade-point average and graduation vs.
nongraduation. As of February 1969, 33 of the students had graduated
and 51 had not graduated or been in attendance during the
preceding 21 months. The correlations between predictors and criteria
are shown in Table 8. Table 8A gives the means and standard
deviations of the predictors and criteria.



Table 8: Validities for 84 Students in Education at the
University of North Dakota

Criteria (column headings): Doctoral Grades; Graduation vs. Nongraduation
Predictors, with entries in printed order:
GRE Advanced Education: oa, 31'
GRE Verbal Ability: 01, 08
GRE Quantitative Ability: (no entries legible)
MAT: 03, 10
Grades for Bachelor's Degree: 0?
Grades for Master's Degree: 22'

*Significant at the .05 level
**Significant at the .01 level



Table 8A: Means and Standard Deviations of Predictors
and Criterion for Students Graduated and Not Graduated

Column headings: Graduated (n = 33): M, SD; Not Graduated: M, SD
Predictor or criterion, with entries in printed order:
GRE Advanced Education: 59, 36
GRE Verbal Ability: 7S, 71
GRE Quantitative Ability: 101, 504, 71
MAT: 5) 7, 8 5^, 11 4
Grades for Bachelor's Degree: 2 75, 39, 39
Grades for Master's Degree: ^54, 5J, 3 43, 25
Doctoral Grades: 3 73, 31, 3 51, 7S



In addition to the six predictors listed above, nine other predictors
were used. The 15 predictors gave a multiple correlation of .51 for
doctoral grades and for graduation vs. nongraduation. The last
multiple correlation was significant at the .01 level.



ADVANCED ENGINEERING TEST 



Engineering is an extremely broad discipline. The subdisciplines of
chemical, civil, electrical, industrial, and mechanical engineering
are all represented on the committee of examiners and are included
in the test. The aim is to ask engineering questions that are
sufficiently fundamental and general so that all engineers,
regardless of their specialty, can reasonably be expected to answer them.
Since mathematics is basic to all branches of engineering, a
substantial number of questions are devoted to mathematics. Two
subscores, engineering and mathematics usage, are provided.

Content 

Questions for the engineering subscore are based on material
common to the several branches of engineering and usually studied
during the first two collegiate years. The areas from which
questions may be drawn are as follows:

Mechanics: statics, dynamics, kinematics; strength of materials;
thermodynamics; fluid mechanics; transfer and rate mechanisms:
heat, mass, momentum; structure of matter; electricity; chemistry;
nature and properties of matter, including the particulate, light and
sound; computer fundamentals; engineering judgment.



Approximately 80 to 90 questions on these topics make up this 
part of the examination. 

Questions for the mathematics usage subscore have been de- 
veloped from two viewpoints: 

1. There is a body of intuitive mathematical concepts (in contrast
to facts and formulas) that forms the basis upon which persons
select from several possible approaches that one best fitted to a
particular situation they have encountered in their discipline.

2. There is a body of knowledge of mathematical facts that should
be at the fingertips of those who use mathematics, facts for
which they cannot always seek verification if they are to work
efficiently in their discipline.

The mathematics background assumed is not more than two
courses in calculus with some simple ideas from linear algebra and
probability, ideas that usually precede or accompany introductory
calculus. This subscore is based on the following kinds of
questions: intuitive calculus problems, factual recall questions, and a
limited number of other types of mathematics questions.

The intuitive calculus questions consist of three sets of
questions, approximately eight in each set, involving graphs of
functions. In each set the students must use basic calculus ideas to
interpret the given graphs and derive information concerning graphs
or paths that are not drawn and that they construct for themselves
from information available in the given figures in order to answer
the questions.

In addition to the questions in the two mathematics usage parts, a
few mathematics questions appear in the engineering parts of the
test. Scores on these are part of the mathematics usage subscore.

Responses to Background Questions, 1970-71
(N = 7,858)

A. At what point are you in your studies?

(1) I am in or have just completed my junior year of
    undergraduate study.
52% (2) I am an undergraduate senior.
23% (3) I have a bachelor's degree but am not presently enrolled
    in graduate school.
11% (4) I am in or have just completed my first year of graduate
    study.
(5) I am in or have completed my second year of graduate
    study.

B. What graduate degree do you intend to seek?

5% (1) I do not plan to pursue graduate study.
2% (2) I plan to pursue graduate work but not to obtain a
    graduate degree.
52% (3) I plan to obtain terminal M.A., M.S., or other degree at the
    master's level.
25% (4) I plan to obtain M.A., M.S., or other master's level degree
    leading to a doctoral degree.
14% (5) I plan to obtain Ph.D., Ed.D., or other degree at the
    doctoral level.

What is the branch of engineering in which you are presently
registered or were most recently registered? (Mark one space for
Question C or D to answer this question according to the following
code.)

C.

12% (1) Chemical engineering
12% (2) Civil engineering
34% (3) Electrical engineering
3% (4) Industrial engineering
20% (5) Mechanical engineering

D.

1% (1) Agricultural engineering
6% (2) Aeronautical engineering
To (3) Metallurgical engineering
T'o (4) Nuclear engineering
(5) None of the above in either (C) or (D)



Validity Data 

The subjects for a study by Creager (1965) were 300 male applicants
for National Science Foundation fellowships in 1955 and 1956. The
predictors were the GRE Aptitude Test (verbal and quantitative) and
the Advanced Engineering Test. One criterion was time lapse
between attainment of the B.A. and the Ph.D., coded as shown
below.

B.A.-Ph.D. Time     Less                              No Ph.D. by
Lapse (in years)    than 4   4   5   6   7   8   9    Aug. '64
Coded Variable         1     2   3   4   5   6   7       8

A second criterion was the dichotomous variable of attaining or not
attaining a Ph.D. by August 1964. The third criterion was the
dichotomous variable of attaining or not attaining a Ph.D. in the
average time taken to attain a Ph.D. in the field. The relationships
between predictors and criteria are shown in Table 9.
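The coding of the first criterion, together with the sign reversal noted in the footnote to Table 9, can be sketched in a few lines of Python. This is an illustration only, not the procedure used in the study: the function names `code_time_lapse` and `reflect` are hypothetical, lapses are assumed to be whole years, `None` is an assumed convention for "no Ph.D. by August 1964", and a lapse longer than 9 years is treated as an error because the coding table does not define it.

```python
from typing import Optional


def code_time_lapse(years: Optional[int]) -> int:
    """Return the coded variable (1-8) for a B.A.-Ph.D. time lapse.

    `years` is the lapse in whole years, or None when no Ph.D. had been
    attained by August 1964 (an assumed convention for this sketch).
    """
    if years is None:
        return 8          # no Ph.D. by Aug. '64
    if years < 4:
        return 1          # less than 4 years
    if years <= 9:
        return years - 2  # 4 -> 2, 5 -> 3, ..., 9 -> 7
    raise ValueError("lapse beyond 9 years is not defined in the coding table")


def reflect(correlation: float) -> float:
    """Reverse the sign so that a shorter lapse counts as the better outcome."""
    return -correlation
```

Because a low code means a quickly earned doctorate, a predictor that works as intended would correlate negatively with the raw code; reversing the sign lets larger positive values indicate stronger validity, which matches how the reflected time-lapse column is described.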



Table 9: Validities of GRE against Doctorate Attainment
for 300 Males Who Were Applicants for National Science
Foundation Fellowships in Engineering in 1955 and 1956

Criteria (column headings): Reflected B.A.-Ph.D. Time Lapse*;
Ph.D. by 1964; Ph.D. in Average Time
Sub-headings (partly legible): Biserial
Predictors, with entries in printed order:
GRE Verbal Ability: 41, 2B, *2
GRE Quantitative Ability: ?i, Si, 31, 19, 2$
Engineering: 33, 31, 45, 31, 46
(row label not legible): 3S, 34, 50, 34

*Correlations between the coded variable B.A.-Ph.D. time lapse given above and the
predictors, with the signs reversed.






ADVANCED FRENCH TEST 



Rather than stress any one content area, the committee responsible
for the test tries to achieve a balanced approach. The test
encompasses questions on reading comprehension, literary interpretation
and criticism, literary history, and culture and civilization.

The questions on reading comprehension and on literary
interpretation and criticism are designed to test comprehension on a
variety of levels. Questions deal with vocabulary recognition and
use of context to determine meaning, sensitivity to style and literary
values, and ability to follow the development of an author's
thought. Prose and poetry selections represent various periods and
genres from the sixteenth century to the present. Texts vary
considerably in length as well as in the number of questions based
on them.

In the section on literary history, which includes the Middle Ages,
questions require specific information on major works, authors,
trends and movements and the ability to grasp significant
relationships. Questions about French culture and civilization touch
upon such topics as geography, history, institutions, and the arts
and sciences. Other questions concern definitions of the genres,
their evaluation, the vocabulary of rhetoric, and the ideas
propounded in the last 30 years by the new critics, novelists, and
dramatists. In this way, the committee attempts to recognize the
critical approaches to literature that are becoming more prevalent
without neglecting the broad evolutionary perspectives still
considered valid.

Literary selections and questions give approximately equal
attention to all centuries from the sixteenth through the twentieth.
Knowledge of the language (grammar, idioms) is not tested in
questions separate from interpretive reading selections, and linguistics
is not tested at all.



Responses to Background Questions, 1970-71 
(N = 2,472)



A. At what point are you in your studies?

(1) I am in or have just completed my junior year of
    undergraduate study.
(2) I am an undergraduate senior.
20% (3) I have a bachelor's degree but am not presently enrolled
    in graduate school.
T^o (4) I am in or have just completed my first year of graduate
    study.
J'^o (5) I am in or have completed my second year of graduate
    study.

B. What graduate degree do you intend to seek?

4% (1) I do not plan to pursue graduate study.
3% (2) I plan to pursue graduate work but not to obtain a
    graduate degree.
49% (3) I plan to obtain terminal M.A., M.S., or other degree at the
    master's level.
26% (4) I plan to obtain M.A., M.S., or other master's level degree
    leading to a doctoral degree.
17% (5) I plan to obtain Ph.D., Ed.D., or other degree at the
    doctoral level.

C. Was French regularly spoken in your home when you were a
child?

(1) Yes
90% (2) No

D. For what length of time have you studied in or lived in a
French-speaking country?

2To (1) Not at all
21% (2) Some, but less than three months
8% (3) Three to six months
23% (4) Six months to one year
15% (5) More than one year

E. What is (or was) your undergraduate major field?

65% (1) French
(2) Another foreign language
11% (3) Other

F. If you majored in French as an undergraduate, which of the
following was most emphasized in your courses?

65% (1) French literature
17% (2) French language proficiency
2% (3) Civilization and culture (including area studies)
Z^'o (4) Linguistics (history of language, structure of language)
3% (5) Other






ADVANCED GEOGRAPHY TEST 



Content

The questions in the Advanced Geography Test are drawn from the
courses of study most commonly offered. Approximately 40 percent
of the questions are devoted to physical geography and 60 percent
to human geography. Questions on physical geography deal with
such topics as climate, landforms, biogeography, vegetation, soils,
the environment as a system, water, and cartography. Questions on
human geography cover such areas as economic geography,
resources, transportation, trade, settlement, and population. Some
questions require knowledge of more than one area or more than
one aspect of geography. Increasing emphasis is being placed by
the committee responsible for the test on including questions that
call for application by the students of their powers of reasoning and
analytical skills rather than merely their capacity for recall.

Responses to Background Questions, 1970-71
(N = 962)

A. At what point are you in your studies?

(1) I am in or have just completed my junior year of
    undergraduate study.
53% (2) I am an undergraduate senior.
2'\^z> (3) I have a bachelor's degree but am not presently enrolled
    in graduate school.
(4) I am in or have just completed my first year of graduate
    study.
10% (5) I am in or have completed my second year of graduate
    study.

B. What graduate degree do you intend to seek?

i'^ (1) I do not plan to pursue graduate study.
V'- (2) I plan to pursue graduate work but not to obtain a
    graduate degree.
47% (3) I plan to obtain terminal M.A., M.S., or other degree at the
    master's level.
35% (4) I plan to obtain M.A., M.S., or other master's level degree
    leading to a doctoral degree.
14% (5) I plan to obtain Ph.D., Ed.D., or other degree at the
    doctoral level.



C. How many semester (quarter) courses have you had in
physical geography?

30% (1) One or fewer
24% (2) Two
2?*o (3) Three or four
10% (4) Five or six
(5) Seven or more

D. How many semester (quarter) courses have you had in
economic geography?

66% (1) One or fewer
le^o (2) Two
(3) Three or four
(4) Five or six
?*o (5) Seven or more

E. How many semester (quarter) courses have you had in
cultural geography?

-JTo (1) One or fewer
21% (2) Two
19% (3) Three or four
10% (4) Five or six
(5) Seven or more

F. Have you had any courses in geographic thought and
methods?

53% (1) Yes

G. Have you had any courses in cartography?

STo (1) Yes
35% (2) No






ADVANCED GEOLOGY TEST 



Content 

Modern geological thinking crosses many subject boundaries, and
numerous questions in the test reflect this tendency. Nevertheless,
each question reasonably falls into one of three major categories.

A separate subscore is reported for each of these three
categories. A further description of the content follows.

I. STRATIGRAPHY, PALEONTOLOGY, AND
   GEOMORPHOLOGY 70 questions

   A. Stratigraphy
   B. Sedimentology
   C. Paleontology (invertebrate and vertebrate)
   D. History
   E. Geomorphology, including glaciology
   F. General, including oceanography

II. STRUCTURAL GEOLOGY AND GEOPHYSICS m questions

   A. Structure: field relations
   B. Structure: dynamics (experimental and theoretical)
   C. Tectonics
   D. Isostasy, gravity, and magnetism
   E. Earthquakes and seismology
   F. Heat and electrical properties
   G. General, including planetology

III. MINERALOGY, PETROLOGY, AND
   GEOCHEMISTRY 60 questions

   A. Mineralogy
      1. Chemical compositions
      2. Physical properties (optical, x-rays, and crystallography)
   B. Petrology
      1. Field relations
      2. Compositions and mineral assemblages of rocks
   C. Geochemistry
      1. Solutions
      2. Phase equilibria
   D. Radiometric dating
   E. Economic mineral deposits

The Advanced Geology Test is designed to measure important
abilities, as follows:

• Ability to analyze geologic phenomena using, for example, maps,
graphs, cross sections, thin sections, block diagrams, diagrams
resulting from instrumental methods, and perceptions in three
dimensions.

• Ability to comprehend geological processes, including
comprehension through the application of physics, chemistry, biology,
and mathematics.

• Ability to demonstrate knowledge of basic geology.



Responses to Background Questions, 1970-71
(N = 1,636)

A. At what point are you in your studies?

3% (1) I am in or have just completed my junior year of
    undergraduate study.
64% (2) I am an undergraduate senior.
15% (3) I have a bachelor's degree but am not presently enrolled
    in graduate school.
9% (4) I am in or have just completed my first year of graduate
    study.
8% (5) I am in or have completed my second year of graduate
    study.

B. What graduate degree do you intend to seek?

2% (1) I do not plan to pursue graduate study.
1% (2) I plan to pursue graduate work but not to obtain a
    graduate degree.
37% (3) I plan to obtain terminal M.A., M.S., or other degree at the
    master's level.
37% (4) I plan to obtain M.A., M.S., or other master's level degree
    leading to a doctoral degree.
22% (5) I plan to obtain Ph.D., Ed.D., or other degree at the
    doctoral level.

C. Did you major in geology as an undergraduate?

87% (1) Yes
11% (2) No

D. If geology was not your undergraduate major, in which of the
following fields did you concentrate as an undergraduate?

(1) Biology
1% (2) Chemistry
4% (3) Physics
?c (4) Mathematics
9% (5) Other

E. With respect to graduate schools, why are you taking this
test?

26% (1) To gain admission to a graduate school only.
T^'o (2) To secure financial assistance from a graduate school
    only.
57% (3) To gain admission to and secure financial assistance
    from a graduate school.

F. Are you taking this test in order to secure financial assistance
from the National Science Foundation?

(1) Yes
7©=^* (2) No

Validity Data 

1. A group reported on by the Office of Educational Research
(1960) was composed of 7$ students registering for graduate work
in geology at Yale University in the years 1952-1961, inclusive. The
predictors were scores on the GRE Advanced Geology Test and the
GRE Aptitude Test (verbal and quantitative). The criterion was a
composite rating of the students by faculty members. The
correlations with this criterion were .51 for Advanced Geology, .32 for
verbal ability, .38 for quantitative ability, and .54 for an optimally
weighted composite of the three predictors.

2. The subjects for a study by Creager (1965) were 119 male
applicants for National Science Foundation fellowships in 1955 and
1956. The predictors were scores on the Aptitude Test (verbal
and quantitative) and the Advanced Geology Test. One criterion
was time lapse between attainment of the B.A. and the Ph.D., coded
as shown below.

B.A.-Ph.D. Time     Less                              No Ph.D. by
Lapse (in years)    than 4   4   5   6   7   8   9    Aug. '64
Coded Variable         1     2   3   4   5   6   7       8

A second criterion was the dichotomous variable of attaining or not
attaining a Ph.D. by August 1964. The third criterion was the
dichotomous variable of attaining or not attaining a Ph.D. in the
average time taken to attain a Ph.D. in the field. The relationships
between predictors and criteria are shown in Table 10.



Table 10: Validities of GRE against Doctorate Attainment
for 119 Males Who Were Applicants for National Science
Foundation Fellowships in Geology in 1955 and 1956

Criteria (column headings, as far as legible): Ph.D. by 1964
(Point Biserial, Biserial); Average Time (Point Biserial)
Predictors, with entries in printed order:
GRE Verbal Ability: .25, .26, .36
GRE Quantitative Ability: .26, .27, 37, .2*, .33
GRE Advanced Geology: .22, 20, 27, .22, .30
(row label not legible): 31, 33, .46, 32

*Correlations between the coded variable for B.A.-Ph.D. time lapse given above and
the predictors, with the signs reversed.





ADVANCED GERMAN TEST 



Content 

Rather than stress any one content area, the committee responsible
for the test endeavors to achieve a balanced approach. The test
encompasses questions on German structure and idiomatic usage,
reading comprehension, literary interpretation and sensitivity,
literary history, and culture and civilization. A few questions also touch
on basic concepts of linguistics.

Prose and poetry selections, principally from nineteenth-century
and twentieth-century literature and of various degrees of difficulty,
are designed to test comprehension on a variety of levels. In
addition to measuring accurate comprehension of content, questions
deal with literary expression, sensitivity to style and literary values,
literary criticism, and the ability to follow the development of an
author's thought. The questions on literary history, from the Middle
Ages to the present, require specific information on major works,
authors, trends, and movements and the ability to grasp significant
relationships. Questions on German culture and civilization touch
upon history, geography, institutions, science, and the arts.

Literary selections stress the twentieth century heavily and none
are earlier than the nineteenth century; both fiction and nonfiction
are represented. However, questions on literary facts embrace the
entire history of German literature from the Middle Ages on.
Sensitivity to literary style is tested through specific item types.
Knowledge of grammar and idioms and applied linguistics are
tested separately.



Responses to Background Questions, 1970-71
(N = 702) 

A. At what point are you in your studies?

3% (1) I am in or have just completed my junior year of
    undergraduate study.
€4^0 (2) I am an undergraduate senior.
ler'/y (3) I have a bachelor's degree but am not presently enrolled
    in graduate school.
(4) I am in or have just completed my first year of graduate
    study.
(5) I am in or have just completed my second year of
    graduate study.



B. What graduate degree do you intend to seek?

3% (1) I do not plan to pursue graduate study.
3% (2) I plan to pursue graduate work but not to obtain a
    graduate degree.
39% (3) I plan to obtain terminal M.A., M.S., or other degree at the
    master's level.
27% (4) I plan to obtain M.A., M.S., or other master's level degree
    leading to a doctoral degree.
26% (5) I plan to obtain Ph.D., Ed.D., or other degree at the
    doctoral level.

C. Was German regularly spoken in your home when you were a
child?

17% (1) Yes
82% (2) No

D. Do you now speak German with native or near-native
fluency?

51% (1) Yes
Ar/o (2) No

E. When did you begin to study German?



42^t 
27% 
12% 



21% 
14% 

25% 



79% 
4% 
IT'/s 

H. 

t4% 

3% 
4% 



(1) In grade school
(2) In junior high school
(3) In high school
(4) As a college freshman
(5) As a college sophomore or later

For what length of time have you studied in or lived in a
German-speaking country?

(1) Not at all
(2) Some, but less than three months
(3) Three to six months
(4) Six months to one year
(5) More than one year

What is (or was) your undergraduate major field?

(1) German
(2) Another foreign language
(3) Other

If you majored in German as an undergraduate, which of the
following was most emphasized in your courses?

(1) Literature
(2) Language proficiency
(3) Civilization and culture (including area studies)
(4) Linguistics (history of language, structure of language)
(5) Other






ADVANCED HISTORY TEST 



Content 

The questions in the test are drawn from the courses of study most
commonly offered. Major considerations that have determined the
content and form of the Advanced History Test are the uses made of
the test and the wide variation in preparation of the students taking
it.

In other words, the test provides one measure of the experience
an undergraduate major in history has acquired in the discipline of
history and of the knowledge and abilities required for graduate
work in history. More specifically, this experience consists of (1)
familiarity with historical data and (2) the ability to apply knowledge
gained by this familiarity, particularly to perceive relationships
(both those involving individuals and movements and those that are
chronological) and to analyze historical material in various forms,
such as historical documents or passages from historical works.
The potential graduate student should also have an understanding
of the meaning and use of sources and the significance of
movements and periods. Thus, the test measures factual knowledge not
for itself but as it facilitates the understanding of periods, trends,
and relationships.

The problem of content coverage in a single history test is
complex. It is almost impossible to delimit the field of history in area, in
time, and in scope. Moreover, no common core of knowledge is
required of all history majors in all colleges. In the Advanced History
Test, all the questions refer to the history of the United States and
Europe (somewhat more questions are devoted to the latter than
the former) because these remain the areas in which the greatest
number of students concentrate. There may be questions involving
an understanding of the relationships among these and other
areas. Similarly, questions deal with economic, social, and
intellectual (as well as political) history in about equal proportions.
Although the majority of the questions concern the period after
1789, the Middle Ages, the Renaissance, and the Reformation are
also covered.

The committee responsible for the test is acutely aware that
histories other than those of Europe and the United States have
increasingly assumed a greater place in the curriculum, both as
required courses and electives. For many years, in fact, the GRE
Advanced History Test included questions on Asian, African, and
Latin American history. In 1969 the committee reluctantly decided
to stop including such questions in the test. Given the limited
number of students enrolled in courses in other than European and
United States history, it had always been difficult to judge how
many questions could provide adequate coverage or useful
conclusions. What was decisive, however, was the evidence in the test
results that the examinees as a whole did poorly on such questions.
Although the committee would like to have kept them in the test as
acknowledgment of the rising importance of Asian, African, and
Latin American studies, it was neither fair to the majority of
students nor good test-making practice to present questions that,
however basic, were unlikely to be answered correctly by able
students with extensive backgrounds in the prevailing European
and American history curriculums.

More recently, on the basis of a course survey available to the
committee from a background questionnaire filled out by
examinees in Advanced History, it was decided, again
reluctantly, not to resume testing in Asian, African, and Latin
American history at this time. The committee, however, continues
to monitor curriculum developments to determine whether a
change in test content is warranted.

History is typical of most of the Advanced Tests in that it is a
highly reliable test but one of greater than middle difficulty for the
test population. Attempts to reduce the difficulty of the History Test
are being made through decreasing the reading time needed to
answer the application questions, diminishing the emphasis on
economic history, and posing more basic questions on Russian and
Eastern European history.



Responses to Background Questions, 1970-71
(N = 10,637)

A. At what point are you in your studies?

    (1) I am in or have just completed my junior year of
        undergraduate study.
62% (2) I am an undergraduate senior.
21% (3) I have a bachelor's degree but am not presently enrolled
        in graduate school.
    (4) I am in or have just completed my first year of graduate
        study.
 4% (5) I am in or have completed my second year of graduate
        study.

B. What graduate degree do you intend to seek?

    (1) I do not plan to pursue graduate study.
 2% (2) I plan to pursue graduate work but not to obtain a
        graduate degree.
40% (3) I plan to obtain terminal M.A., M.S., or other degree at the
        master's level.
31% (4) I plan to obtain M.A., M.S., or other master's level degree
        leading to a doctoral degree.
20% (5) I plan to obtain Ph.D., Ed.D., or other degree at the
        doctoral level.

In answering questions C through I, include the courses you are
presently taking.

C. How many semester (quarter) courses have you had in United
States history (including the Colonial period)?

 5% (1) None
25% (2) One or two
34% (3) Three or four
21% (4) Five or six
15% (5) Seven or more

D. How many semester (quarter) courses have you had in
ancient history and medieval European history (including
courses in the history of individual European countries)?

19% (1) None
27% (2) One
24% (3) Two
13% (4) Three
16% (5) Four or more






E. How many semester (quarter) courses have you had in the
Renaissance and early modern European history to 1789
(including courses in the history of individual European
countries)?

24% (1) None 

37% (2) One 

23% (3) Two 

9% (4) Three 

e% (5) Four or more 

F. How many semester (quarter) courses have you had in
modern European history from 1789 to the present (including
courses in the history of individual European countries)?

16% (1) None

29% (2) One

24% (3) Two 

U% (4) Three 

t€% (5) Four or more 

G. How many semester (quarter) courses have you had in the
history of China and/or Japan?

65% (1) None
20% (2) One
 7% (3) Two
 ?% (4) Three
    (5) Four or more

H. How many semester (quarter) courses have you had in the
history of Africa (including courses in the history of
individual African countries)?

64% (1) None
 M% (2) One
 2% (3) Two
 1% (4) Three
    (5) Four or more

I. How many semester (quarter) courses have you had in the
history of Latin America (including courses in the history of
individual Latin American countries)?

73% (1) None
17% (2) One
 6% (3) Two
 2% (4) Three
 2% (5) Four or more



Validity Data

1. A group reported on by Johnson and Thompson (1962) was
composed of a small number of graduate students in history at
Sacramento State College. The correlation between the predictor
of GRE Advanced History Test scores and the criterion of
grade-point averages in all graduate study was .56. The coefficient
was significant at the .05 level.

2. A group studied by Lannholm, Marco, and Schrader (1968) was
composed of 66 students first enrolled in a particular graduate
department of history between the fall of 1957 and June 1960. The
predictors were scores on the GRE Advanced History Test and the GRE
Aptitude Test, and undergraduate grade-point average. The
criterion was success in graduate study, defined as having earned the
Ph.D. or being still enrolled and rated by faculty members as
outstanding or superior in the fall of 1963. The results are shown in
Table 11.



Table 11: Validities for 66 History Students

                                  Correlation   Mean           Standard Deviation
                                  with          Performance    of Performance
Predictor                         Criterion     on Predictor   on Predictor

GRE Advanced History                            565            98
GRE Verbal Ability                .41                          1 TS
GRE Quantitative Ability          .36           509
UGPA                              .41                          56
Optimally Weighted Combination    .59







3. Another group studied by Lannholm et al. (1968) was composed
of 26 students first enrolled in a particular graduate department of
history between the fall of 1957 and June 1960. The predictors were
scores on the GRE Advanced History Test and the GRE Aptitude
Test. The criterion was success in graduate study, defined as
having earned the Ph.D. or being still enrolled and rated by faculty
members as outstanding or superior in the fall of 1963. The results
are shown in Table 12.



Table 12: Validities for 28 History Students

                                  Correlation   Mean           Standard Deviation
                                  with          Performance    of Performance
Predictor                         Criterion     on Predictor   on Predictor

GRE Advanced History              .46                          7^
GRE Verbal Ability                -.04
GRE Quantitative Ability                        &49            101



4. Roberts (1970) studied the records of 63 students who had
enrolled at Wake Forest University from June 1964 to June 1970 for
graduate study in history and who had completed at least nine
hours of graduate work. The correlations between graduate
grade-point averages and GRE scores were -.31 for verbal ability, -JB
for quantitative ability, and -.31 for Advanced History.






ADVANCED LITERATURE IN ENGLISH TEST 



Content 

The test contains questions on poetry, drama, biography, the essay,
criticism, the short story, the novel, and, to a limited extent, the
history of the language. The test draws on English and American
literature of all periods; it also contains a few questions on well-known
foreign writers and on works, including the Bible, translated from
foreign languages. Throughout, the emphasis is on major authors,
works, and movements.

The questions may be somewhat arbitrarily classified into two
groups, factual and critical. The factual questions test a student's
knowledge of the major writers usually studied in college literature
courses. For example, the student may be asked to identify a
writer's major contribution to literary history, to assign a literary
work to the period in which it was written, to identify the primary
theme of a work, to identify common kinds of poetic meter, to
recognize a literary allusion in a given context, to identify a writer or
work described in a brief critical comment, or to determine the
period or author of a work on the basis of the style and content of a
short excerpt. The critical questions test the ability to read a literary
text perceptively. Students are asked to examine a given passage of
prose or poetry and to answer questions about the author's thesis
or ideas and his or her use of figurative language. Such questions
also deal with form and structure, literary techniques, and various
aspects of style.

Often examinees will feel the test has discovered and emphasized
areas in which they are least adequate; in fact, examinees tend to
remember most vividly those questions that proved troublesome.
Students taking the GRE should remember that, in a test of
approximately 230 questions, much of the material presented no
undue difficulty. The very length and scope of the examination
eventually work to the benefit of the students and give them an
opportunity to demonstrate what they do know. No one is expected to
answer all the questions correctly.

Responses to Background Questions, 1970-71 
(N = 14,079) 

A. At what point are you in your studies?

    (1) I am in or have just completed my junior year of
        undergraduate study.
60% (2) I am an undergraduate senior.
22% (3) I have a bachelor's degree but am not presently enrolled
        in graduate school.
10% (4) I am in or have just completed my first year of graduate
        study.
 5% (5) I am in or have completed my second year of graduate
        study.



B. What graduate degree do you intend to seek?

 3% (1) I do not plan to pursue graduate study.
 2% (2) I plan to pursue graduate work but not to obtain a
        graduate degree.
43% (3) I plan to obtain terminal M.A., M.S., or other degree at the
        master's level.
30% (4) I plan to obtain M.A., M.S., or other master's level degree
        leading to a doctoral degree.
20% (5) I plan to obtain Ph.D., Ed.D., or other degree at the
        doctoral level.

C. What was your undergraduate major?

90% (1) English
 2% (2) History or philosophy
 2% (3) Social science
 1% (4) Foreign language
    (5) A natural science or mathematics

D. If you studied English in college, in which of the following
areas did you concentrate?

59% (1) English literature
17% (2) American literature
 W% (3) Comparative literature
 1% (4) Linguistics
 4% (5) Composition and rhetoric

E. How many semester (quarter) courses in English literature
have you taken?

 2% (1) Fewer than two
la% (2) Two or three
22% (3) Four or five
21% (4) Six or seven
42% (5) Eight or more

F. How many semester (quarter) courses in American literature
have you taken?

27% (1) Fewer than two
45% (2) Two or three
18% (3) Four or five
 6% (4) Six or seven
 4% (5) Eight or more

G. Which of the following best describes a comprehensive
historical survey of English literature (as opposed to a period or
genre course) which you may have taken in college?

54% (1) A two-semester full survey (at least from Chaucer to the
        twentieth century)
10% (2) A one-semester full survey
 8% (3) The first half of a two-semester survey
 3% (4) The second half of a two-semester survey
25% (5) No survey course taken






H. In what period has your undergraduate preparation been
most thorough (in number of hours taken)?

 5% (1) Old and Middle English
    (2) Renaissance
10% (3) Restoration and eighteenth century
26% (4) Romantic and Victorian
36% (5) Modern

I. In what genre has your undergraduate preparation been
most thorough (in number of hours taken)?

10% (1) Drama
31% (2) Poetry
47% (3) Fiction
 4% (4) Prose nonfiction
 4% (5) Other

Validity Data

1. The group included in a study by Lannholm, Marco, and
Schrader (1968) was composed of 98 students first enrolled in a
particular graduate department of English between the fall of 1957
and June 1960. The predictors were scores on the GRE Advanced
Literature in English Test and the GRE Aptitude Test. The criterion
was success in graduate study, defined as having earned the Ph.D.
or being still enrolled and rated by faculty members as outstanding
or superior in the fall of 1962. The results are reported in Table 13.



Table 13: Validities for 98 Literature Students

                                     r-biserial    Mean           Standard Deviation
                                     Correlation   Performance    of Performance
Predictor                            with Criterion on Predictor  on Predictor

GRE Advanced Literature in English   n             TOS
GRE Verbal Ability                                                59
GRE Quantitative Ability                           576
Optimally Weighted Combination









2. The group in a second study by Lannholm et al. (1968) was
composed of 81 students first enrolled in a particular graduate
department of English between the fall of 1957 and June 1960. The
predictors were scores on the GRE Advanced Literature in English
Test and the GRE Aptitude Test and undergraduate grade-point
average. The criterion was success in graduate study, defined as
having earned the Ph.D. or being still enrolled and rated by faculty
members as outstanding or superior in the fall of 1963. The results
are reported in Table 14.



Table 14: Validities for 81 Literature Students

                                     r-biserial    Mean           Standard Deviation
                                     Correlation   Performance    of Performance
Predictor                            with Criterion on Predictor  on Predictor

GRE Advanced Literature in English   .43           626            a?
GRE Verbal Ability                   .32           6U             102
GRE Quantitative Ability             .45           490
UGPA                                                              51
Optimally Weighted Combination       .67







3. Roberts (1970) studied the records of 60 students who had
enrolled at Wake Forest University from June 1964 to June 1970 for
graduate study in English and who had completed at least nine
hours of graduate work. The correlations between graduate
grade-point averages and GRE scores were .17 for verbal ability, .01 for
quantitative ability, and .54 for Advanced Literature in English.






ADVANCED MATHEMATICS TEST 



Content 

The questions in the test are drawn from the courses of study most
commonly offered. Approximately 50 percent of the questions
involve analysis and its applications, subject matter that can be
assumed to be common to the backgrounds of almost all
mathematics majors. About 25 percent of the questions in the test
are in linear and abstract algebra.

The remaining portion consists of a few questions in each of
several other areas of mathematics currently offered to
undergraduates in many institutions. Included are such areas as
probability and statistics, number theory, set theory, logic,
combinatorial analysis, topology, numerical analysis, and computer
programming. There are also questions that ask the candidate to
match "real-life" situations to appropriate mathematical models.

Because the material in the last-mentioned 25 percent of the test
is so diverse, no useful purpose would be served in attempting to
describe any substantial part of it. However, the material in analysis
and algebra, on which 75 percent of the test is based, is probably
well enough defined to make the following somewhat more detailed
description useful.

Analysis. The usual material of two years of calculus, including
trigonometry, analytic geometry (through conic sections and
quadric surfaces), and introductory differential equations;
introductory real variable theory, as presented in courses such as those
entitled "advanced calculus" or "methods of real analysis" that
include the elementary topology of the line, plane, 3-space, and
n-space as well as Riemann integration, and, it is hoped, Stieltjes and
Lebesgue integration.

Topics in Algebra. Group (abelian, cyclic), subgroup, normal
subgroup, quotient group, permutation group, order (of group, of
element), Lagrange's Theorem, ring, ideal, integral domain, zero
divisor, field, polynomial ring, congruence modulo an integer,
divisibility, division algorithm (for integers, polynomials),
homomorphism, isomorphism, and automorphism (for groups, rings, fields),
vector space, kernel, null space, dimension, linear independence,
dual space, inner product space, linear transformation, matrix of a
linear transformation, characteristic root, trace, determinant,
matrix operations, similarity of matrices, spectral theorem for
normal matrices (possibility of diagonalization).

Neither the description of analysis nor the list of topics in algebra
is intended to be exhaustive. Obviously, it is necessary to
understand many other concepts, but it is hoped the description will
provide the prospective examinee with a useful idea of the material
past and present committees of examiners have considered, and
now consider, to be appropriate for a test designed to measure
knowledge, skills, and aptitude needed for graduate study in
mathematics.

It should be emphasized that knowledge of the material
described above is a necessary, but not sufficient, condition for
correctly answering the questions in analysis and algebra. Actually,
a substantial number of questions require no more than a good
precalculus background, and, in general, questions are intended
to test more than straight knowledge and, indeed, concentrate
on testing (1) understanding of fundamental concepts and (2) the
ability to choose among and apply these concepts in novel
situations.



A substantial number of "insightful" questions are included.
Such questions have at least two avenues of approach, one obvious
and requiring tedious manipulations, and the other not at all
obvious but requiring little, if any, computation or manipulation.

Responses to Background Questions, 1970-71
(N = 7,131)

A. At what point are you in your studies?

 4% (1) I am in or have just completed my junior year of
        undergraduate study.
64% (2) I am an undergraduate senior.
16% (3) I have a bachelor's degree but am not presently enrolled
        in graduate school.
    (4) I am in or have just completed my first year of graduate
        study.
 6% (5) I am in or have completed my second year of graduate
        study.

B. What graduate degree do you intend to seek?

 5% (1) I do not plan to pursue graduate study.
    (2) I plan to pursue graduate work but not to obtain a
        graduate degree.
38% (3) I plan to obtain terminal M.A., M.S., or other degree at the
        master's level.
^4% (4) I plan to obtain M.A., M.S., or other master's level degree
        leading to a doctoral degree.
28% (5) I plan to obtain Ph.D., Ed.D., or other degree at the
        doctoral level.

C. Which of the following best describes your reason for taking
this examination?

 9% (1) It is required to qualify for a National Science Foundation
        Fellowship.
44% (2) It is required in applying for a graduate school fellowship
        or assistantship.
17% (3) It is required for continuing graduate study at my
        institution.
    (4) It is required for earning an undergraduate degree at my
        institution.
21% (5) Other

D. How many semester (quarter) courses in mathematics have
you taken above the level of precalculus mathematics?

24% (1) Eight or fewer
2e% (2) Nine to eleven
21% (3) Twelve to fifteen
10% (4) Sixteen to twenty
15% (5) Twenty-one or more

E. In which of the following areas of mathematics has your
study been most concentrated?

43% (1) Real analysis
 3% (2) Complex analysis
37% (3) Algebra
    (4) Topology
 3% (5) Geometry






F. In what area other than mathematics was your
undergraduate preparation strongest?

15% (1) Computer science
24% (2) Physics
11% (3) A natural science other than physics
 6% (4) Philosophy
41% (5) Other

G. How many semester (quarter) courses have you taken in
computer science?

36% (1) No undergraduate or graduate
44% (2) One or two undergraduate; no graduate
13% (3) More than two undergraduate; no graduate
 4% (4) One or two graduate
 2% (5) Three or more graduate

H. What is the total number of mathematics semester (quarter)
courses in which you were tested primarily on your ability to
write out proofs?

iT% (1) None
14% (2) One
 W% (3) Two
26% (4) Three or four
2^/o (5) Five or more

I. What is the total number of mathematics semester (quarter)
courses in which you were tested primarily on your ability to
solve problems in the sense of obtaining a numerical answer
to a problem?

15% (1) Two or fewer
25% (2) Three or four
24% (3) Five or six
27% (4) Seven or more but not all
 5% (5) All



Table 15: Validities Using the Criterion of Attainment
of the Doctorate for 845 National Science Foundation
Fellowship Applicants in Mathematics in 1958-61,
Split into Two Random Halves

                                     n = 423                          n = 422
                          r-biserial    Predictor          r-biserial    Predictor
                          Correlation   Performance        Correlation   Performance
                          with                  Standard   with                  Standard
Predictors                Criterion     Mean    Deviation  Criterion     Mean    Deviation

GRE Advanced Mathematics* 3a            65.93   15.39      ^'t           64.93   15.94
GRE Verbal Ability*       .27           62 as   10.96      .32           62.63   11.33
GRE Quantitative Ability* .27           72.67    9.51      .26           71.54   10.14
UGPA†                     .21          252.60   40.22      .24          246.77   43.13
Reference Rating‡         .23           42.60    9.38      .27           42.59    9.69

*Scaled score with third digit dropped.
†Four-point scale multiplied by 100.
‡Zero to 5 multiplied by 10.



Validity Data

1. A group studied by Rock (1972) was composed of 845
applicants for National Science Foundation fellowships. Most applied
for first-year NSF fellowships in 1956-61. The predictors were
scores on the GRE Advanced Mathematics Test and the GRE
Aptitude Test, undergraduate grade-point average, and an average
rating of reference letters. The criterion was attainment of the
doctorate by June 1968. The group was split into random halves; the
validity coefficients for each half are shown in Table 15.
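
The random split described in item 1 can be illustrated with a short sketch. The manual does not describe how the halves were actually drawn, so this is only one plausible way to do it: the function name `split_random_halves`, the fixed seed, and the use of sequential applicant numbers are all assumptions made for illustration.

```python
import random


def split_random_halves(ids, seed=0):
    """Shuffle a copy of ids and cut it into two near-equal halves."""
    rng = random.Random(seed)      # fixed seed (an assumption) keeps the split repeatable
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = (len(shuffled) + 1) // 2  # the first half takes the extra case when n is odd
    return shuffled[:cut], shuffled[cut:]


# Hypothetical applicant numbers 1..845 stand in for the real records.
half_a, half_b = split_random_halves(range(1, 846))
print(len(half_a), len(half_b))  # prints: 423 422
```

Computing the validity coefficients separately on each half, as the study did, gives a rough check on how stable they are across samples.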

2. The group involved in an earlier study by Johnson and
Thompson (1962) was composed of a small number of graduate
students in mathematics at Sacramento State College. The
correlation between the predictor of GRE Advanced Mathematics Test
score and the criterion of grade-point average for all graduate study
was .76. This coefficient was significant at the .05 level.

3. Roberts (1970) studied the records of 37 students who had
enrolled at Wake Forest University from June 1964 to June 1970 for
graduate study in mathematics and who had completed at least
nine hours of graduate work. The correlations between graduate
grade-point averages and GRE scores were * for verbal ability, .55
for quantitative ability, and .47 for Advanced Mathematics.

4. The subjects for a study by Creager (1965) were 260 male
applicants for National Science Foundation fellowships in 1955 and
1956. The predictors were scores on the GRE Aptitude Test (verbal
and quantitative) and the Advanced Mathematics Test. One
criterion was time lapse between attainment of the B.A. and the Ph.D.,
coded as shown below:

B.A.-Ph.D. Time     Less                                   No Ph.D. by
Lapse (in years):   than 4   4    5    6    7    8    9    Aug. 64
Coded Variable:     1        2    3    4    5    6    7    8
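
The coding scheme listed above can be written as a small function. This is a sketch only: the name `code_time_lapse` and its arguments are hypothetical, whole-year lapses are assumed, and the code 8 for "No Ph.D. by Aug. 64" is assumed from the 1-to-7 sequence that precedes it.

```python
def code_time_lapse(years=None, phd_by_aug_1964=True):
    """Return the 1-8 code for a B.A.-to-Ph.D. time lapse in whole years."""
    if not phd_by_aug_1964:
        return 8  # assumed code for "No Ph.D. by Aug. 64"
    if years is None:
        raise ValueError("years is required when the Ph.D. was attained")
    if years < 4:
        return 1  # "Less than 4"
    if years > 9:
        raise ValueError("the listed scheme stops at 9 years")
    return int(years) - 2  # 4 -> 2, 5 -> 3, ..., 9 -> 7


codes = [code_time_lapse(y) for y in (3, 4, 6, 9)]
codes.append(code_time_lapse(phd_by_aug_1964=False))
print(codes)  # prints: [1, 2, 4, 7, 8]
```

Because a higher code means a slower (or unattained) doctorate, a "reflected" version of this variable reverses its direction so that positive correlations indicate faster attainment.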

A second criterion was the dichotomous variable of attaining or not
attaining a Ph.D. by August 1964. The third criterion was the
dichotomous variable of attaining or not attaining a Ph.D. in the
average time taken to attain a Ph.D. in the field. The relationships
between predictors and criteria are shown in Table 16.



Table 16: Validities of GRE against Doctorate Attainment
for 250 Males Who Were Applicants for National Science
Foundation Fellowships in Mathematics in 1955 and 1956

                            Reflected     Ph.D. by 1964          Ph.D. in Average Time
                            B.A.-Ph.D.    Point                  Point
Predictors                  Time Lapse*   Biserial   Biserial    Biserial   Biserial

GRE Verbal Ability          .21           .22        .30         .19        .27
GRE Quantitative Ability    .25           £6         .36         .23
GRE Advanced Mathematics    .36           04         .47         .34
Composite                   .36           .35        .46         .34        .46

*Correlations between the coded variable B.A.-Ph.D. time lapse given above
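
Table 16 reports both point-biserial and biserial coefficients for the dichotomous criteria. The sketch below shows the standard textbook relationship between the two, using only the Python standard library. It is not taken from the study itself; the function names and the eight made-up scores are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev


def point_biserial(scores, attained):
    """Pearson correlation between a score and a 0/1 criterion."""
    group1 = [s for s, a in zip(scores, attained) if a]
    group0 = [s for s, a in zip(scores, attained) if not a]
    p = len(group1) / len(scores)  # proportion attaining the criterion
    q = 1 - p
    return (mean(group1) - mean(group0)) / pstdev(scores) * sqrt(p * q)


def biserial(scores, attained):
    """Biserial estimate, assuming a normal variable underlies the 0/1 split."""
    p = sum(1 for a in attained if a) / len(attained)
    q = 1 - p
    cut = NormalDist().inv_cdf(p)   # normal deviate at the cut point (needs 0 < p < 1)
    ordinate = NormalDist().pdf(cut)  # height of the normal curve at the cut
    return point_biserial(scores, attained) * sqrt(p * q) / ordinate


# Made-up data: eight scores and whether each examinee attained the criterion.
scores = [50, 55, 60, 65, 70, 75, 80, 85]
attained = [0, 0, 0, 1, 0, 1, 1, 1]
r_pb = point_biserial(scores, attained)
r_b = biserial(scores, attained)
print(round(r_pb, 3), round(r_b, 3))
```

The biserial value is always the larger of the two in absolute size, which matches the pattern of paired columns in the table.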






ADVANCED MUSIC TEST 



Content 

Questions in the test are drawn from the courses of study most
commonly offered. Examination of curriculum offerings reveals
that two major areas of study, music theory and music history,
constitute the core of most undergraduate music programs. These
two areas provide the content focus of the Advanced Music Test.
About 40 percent of the questions deal with music theory; the
remainder deal with music history and literature. Since a number of
the questions relate to style analysis, which may combine aspects
of both history and theory, the percentages are approximations
only.

Approximately three-fourths of the theory questions deal with
traditional theory and about one-fourth with contemporary theory.
Aspects of theory covered in the test range from such fundamental
concepts as scales, intervals, and key signatures to concepts
related to jazz and contemporary composition. Other examples of
topics in music theory include cadences, canon, fugue, modes,
rhythmic devices, principles of instrumentation and orchestration,
quartal harmony, polychords, serial music, and electronic music.

Questions dealing with music history and literature cover four
historical periods: Medieval-Renaissance, Baroque,
Classical-Romantic-Impressionistic, and Twentieth Century. The questions
are relatively evenly divided among the four periods.
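
The proportions given in the preceding paragraphs can be combined into approximate shares of the whole test. The sketch below does that arithmetic with exact fractions. It treats "about 40 percent" as exactly 40 percent and "relatively evenly divided" as an exact four-way split, both of which are simplifying assumptions, and the dictionary keys are labels chosen here for illustration.

```python
from fractions import Fraction

THEORY = Fraction(40, 100)   # "about 40 percent" taken as exact (assumption)
HISTORY = 1 - THEORY         # the remainder: music history and literature
PERIODS = [
    "Medieval-Renaissance",
    "Baroque",
    "Classical-Romantic-Impressionistic",
    "Twentieth Century",
]


def music_blueprint():
    """Return each content area's approximate share of the whole test."""
    shares = {
        "traditional theory": THEORY * Fraction(3, 4),
        "contemporary theory": THEORY * Fraction(1, 4),
    }
    for period in PERIODS:
        # An exactly even split across periods is an assumption.
        shares["history and literature: " + period] = HISTORY / len(PERIODS)
    return shares


blueprint = music_blueprint()
for area, share in blueprint.items():
    print(f"{area}: {float(share):.0%}")
```

Under these assumptions traditional theory is about 30 percent of the test, contemporary theory about 10 percent, and each historical period about 15 percent.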

A number of cognitive abilities are measured in the test.
Approximately one-third of the questions in the test are devoted to
style analysis. These questions are intended to test ability to
interrelate facets of musical knowledge, such as styles, composers, and
historical periods, and ability to analyze musical passages,
including score reading, with application of appropriate principles of
theory, harmony, and instrumentation. The remaining two-thirds of
the questions measure familiarity with basic musical terminology,
concepts, and principles; ability to read and interpret musical
notation; and ability to identify, from written musical notation,
compositions and such musical elements as intervals and scales.

Listening skills, which are not tested, are, of course, basic to
the study of music. An experimental listening test was administered
and its validity studied in the early 1960's. However, the limitations
of a testing program that must accommodate the needs of a variety
of students have prevented the inclusion of such a listening
measure. It should be noted that correlations between the
experimental listening test and the written test are quite high.



Responses to Background Questions, 1970-71
(N = 2,503)

A. At what point are you in your studies?

 2% (1) I am in or have just completed my junior year of
        undergraduate study.
53% (2) I am an undergraduate senior.
24% (3) I have a bachelor's degree but am not presently enrolled
        in graduate school.
12% (4) I am in or have just completed my first year of graduate
        study.
 6% (5) I am in or have completed my second year of graduate
        study.



B. What graduate degree do you intend to seek?

 2% (1) I do not plan to pursue graduate study.
 3% (2) I plan to pursue graduate work but not to obtain a
        graduate degree.
53% (3) I plan to obtain terminal M.A., M.S., or other degree at the
        master's level.
27% (4) I plan to obtain M.A., M.S., or other master's level degree
        leading to a doctoral degree.
13% (5) I plan to obtain Ph.D., Ed.D., or other degree at the
        doctoral level.

C. In which of the following areas is your music specialty? 

43% (1) Music education 

36% (2) Applied music {performance) 

6% (3) Music history and literature-musicology

9% (4) Music theory-composition 

4% (5) Other 

D. If you have a declared major instrument, within which of the
following groups does it fall?

24% (1) Voice
37% (2) Piano-organ
 6% (3) Strings
26% (4) Woodwinds, brass
 2% (5) Percussion

E. Which of the following most accurately identifies the type of
institution in which your training is currently being received?

21% (1) Music department in a teacher education institution

51% (2) Music department in a liberal arts institution

12% (3) Music school or fine arts school 

7% (4) Music conservatory 

5% (5) Other 

F. How many semester (quarter) courses have you had 
specifically in ear training and sight singing? 

22% (1) None

10% (2) One

16% (3) Two

12% (4) Three 

35% (5) Four or more 

G. How many semester (quarter) courses have you had that 
pertain specifically to music theory? 

1% (1) None 

2% (2) One 

7% (3) Two 

10% (4) Three 

79% (5) Four or more 

H. How many semester (quarter) courses have you had that
pertain specifically to orchestration and instrumentation?

40% (1) None
35% (2) One
16% (3) Two
 4% (4) Three
 4% (5) Four or more



I. How many semester (quarter) courses have you had that
pertain specifically to music history and literature?

 1% (1) None
    (2) One
23% (3) Two
21% (4) Three
47% (5) Four or more



ADVANCED PHILOSOPHY TEST 



Content

The members of the committee responsible for the test recognize
that the subject matter of philosophy is itself a topic of
philosophical debate and that the subject can be taught in many
different ways. Therefore, the test questions cover a wide range of
material and approaches to philosophy so as to reflect the diversity of
curriculums and student preparation.

A principal aim of the examination is to test the student's
understanding. To this end questions have been devised that will favor
the student who has read both widely and critically in philosophy
and questions avoided that can be answered by a student who has
chosen to rely on outlines and summaries in preparing for the
examination. Thus, although some of the questions can be
answered by drawing on purely factual information, the emphasis is
on analysis, interpretation, and reasoning.

Most of the test is devoted to figures and problems in Western
philosophy, with only an occasional question on Oriental systems
of thought. About one-third of the questions assess philosophic
reasoning and application of logical principles. The greatest
historical emphasis is given to the period between 1600 and but
the student is expected to show some knowledge of the ancient,
medieval, and contemporary periods as well. The medieval period
receives relatively little emphasis.

Various kinds of questions are used to test the student's
competence in the broad areas of ethics, social and political
philosophy, logic and philosophy of language, metaphysics and
philosophy of mind, and epistemology. The fields of philosophy of
science, aesthetics, philosophy of religion, and philosophy of
history receive lesser emphasis. Some questions require students to
demonstrate their ability to grasp the implications of ideas; others
require them to identify important ideas of particular philosophers,
solve problems in logic, or choose appropriate definitions of
philosophical terms. Another type of question requires students to read
a passage, the source of which may or may not be identified, and to
choose an answer on the basis of their interpretation of the
evidence supplied in the text or by applying what they have learned
about the author of the passage.

There has been little attempt to include current moral and social
problems as subject matter for philosophic discussion. This
omission should not be construed as a judgment on the validity of
approaching undergraduate philosophy through considering
contemporary issues. Rather, it reflects the difficulties of anticipating
changes in current concerns and of formulating questions in such a
way that the students' own views would not impede their reading or
answering correctly.




The establishment of subscores for factual information and
philosophic reasoning is currently being planned.

Responses to Background Questions, 1970-71

A. At what point are you in your studies?

 3% (1) I am in or have just completed my junior year of
        undergraduate study.
65% (2) I am an undergraduate senior.
16% (3) I have a bachelor's degree but am not presently enrolled
        in graduate school.
 6% (4) I am in or have just completed my first year of graduate
        study.
 5% (5) I am in or have completed my second year of graduate
        study.

B. What graduate degree do you intend to seek?

 3% (1) I do not plan to pursue graduate study.
 3% (2) I plan to pursue graduate work but not to obtain a
        graduate degree.
15% (3) I plan to obtain terminal M.A., M.S., or other degree at the
        master's level.
26% (4) I plan to obtain M.A., M.S., or other master's level degree
        leading to a doctoral degree.
    (5) I plan to obtain Ph.D., Ed.D., or other degree at the
        doctoral level.

C. Which of the following most accurately describes your
preparation in philosophy?

22% (1) Philosophy major with honors or independent or
        advanced study
59% (2) Philosophy major
 6% (3) Course work equivalent to that of a philosophy major
 9% (4) Philosophy minor
 2% (5) One course or none

D. If you are not a philosophy major, what is (was) your under' 
graduate major? 

4% {1) A natural science or mathematics 
3% (2) A social science (including psychology) 
4% (3) A language or a literature or one of the arts 
4% (4) History, politics, or government (including interdisci- 
plinary area studies) 
5% (5) Other 



E. Which of the following best describes the material of the
introductory course in philosophy that you took?

41% (1) Typical problems from different branches and periods of
        philosophy
23% (2) The doctrines of a very few major philosophers
 6% (3) One period in the history of philosophy
11% (4) One of the areas of philosophy
12% (5) A systematic exposition of philosophical positions or
        schools

F. How many semester (quarter) courses have you had in logic?

17% (1) None
27% (2) One only, in traditional (nonsymbolic) logic
26% (3) One only, in symbolic logic
 9% (4) One only, covering traditional logic and some scientific
        method
19% (5) At least two, including both traditional and symbolic
        logic

G. How many semester (quarter) courses have you had in the
history of philosophy?

 9% (1) None
10% (2) One only, in ancient philosophy
 6% (3) One only, in modern philosophy
37% (4) A full year, covering ancient, medieval and modern
        philosophy
35% (5) A full year or more, including contemporary philosophy

H. Which of the following best describes your exposure to
Oriental systems of thought?

42% (1) No course work and no independent reading
29% (2) Independent reading only
12% (3) Courses offering philosophy credit plus independent
        reading
 3% (4) Courses not offering philosophy credit plus independent
        reading
13% (5) Course(s) in comparative religion or world religions



I. In which of the following areas are you LEAST prepared to
answer questions?

16% (1) Social and political philosophy
34% (2) Aesthetics
35% (3) Philosophy of science
 3% (4) Metaphysics
10% (5) Philosophy of religion



Philosophy Validity Data

Lannholm, Marco, and Schrader (1968) reported on a group
composed of 42 students first enrolled in a particular graduate
department of philosophy between the fall of 1957 and June 1960. The
predictors were scores on the GRE Advanced Philosophy Test and
the GRE Aptitude Test. The criterion was success in graduate study,
defined as having earned the Ph.D. or being still enrolled and rated
by faculty members as outstanding or superior in the fall of 1963.
The results are reported in Table 17.



Table 17: Validities for 42 Philosophy Students

                              Correlation    Mean            Standard Deviation
                              with           Performance     of Performance
Predictor                     Criterion      on Predictor    on Predictor

GRE Advanced Philosophy       .11            779             101
GRE Verbal Ability            .17            70[?]           73
GRE Quantitative Ability      .44            646             92




ADVANCED PHYSICS TEST

Content

The purpose of the test is to assess the students' understanding of
fundamental principles and their ability to apply these principles in
the solution of problems.

The approximate percentages of questions on content topics are
as follows:

                                                    PERCENTAGE
TOPIC                                               OF QUESTIONS

 1. Classical mechanics, including Lagrangian
    and Hamiltonian formulation                         18
 2. Fundamentals of electromagnetism,
    including Maxwell's equations                       16
 3. Atomic physics                                      15
 4. Physical optics and wave phenomena                  10
 5. Quantum mechanics                                   10
 6. Special relativity                                   7
 7. Thermodynamics and statistical mechanics             7
 8. Laboratory methods                                   5
 9. Nuclear and particle physics                         4
10. Solid state physics                                 [illegible]
11. Miscellaneous                                        2



Responses to Background Questions, 1970-71
(N = 3,907)

A. At what point are you in your studies?

 3% (1) I am in or have just completed my junior year of
        undergraduate study.
67% (2) I am an undergraduate senior.
12% (3) I have a bachelor's degree but am not presently enrolled
        in graduate school.
 9% (4) I am in or have just completed my first year of graduate
        study.
 7% (5) I am in or have completed my second year of graduate
        study.

B. What graduate degree do you intend to seek?

 2% (1) I do not plan to pursue graduate study.
 1% (2) I plan to pursue graduate work but not to obtain a
        graduate degree.
14% (3) I plan to obtain terminal M.A., M.S., or other degree at the
        master's level.
25% (4) I plan to obtain M.A., M.S., or other master's level degree
        leading to a doctoral degree.
57% (5) I plan to obtain Ph.D., Ed.D., or other degree at the
        doctoral level.

C. In which of the following fields did you major as an
undergraduate?

86% (1) Physics
 4% (2) Mathematics
 5% (3) Engineering
 1% (4) Chemistry
 4% (5) Other

D. With respect to graduate schools, what is your reason for
taking this test?

1[?]% (1) To gain admission to a graduate school only
[?]% (2) To secure financial assistance from a graduate school
        only
70% (3) To gain admission to graduate school and to secure
        financial assistance

E. Are you taking this test in order to secure financial assistance
from the National Science Foundation?

26% (1) Yes
69% (2) No

Validity Data

1. The group involved in a study by Michels (1966) was composed
of 72 students who entered graduate institution A and 52 who
entered institution B in the fall of 1962. The predictors were scores
on the GRE Advanced Physics Test and the GRE Aptitude Test
(verbal and quantitative). The criteria were two faculty ranking
indexes (RI): one of the students' performance based largely on
grades in the first year of graduate study, and one based on
performance in the first three years of graduate school. For
institution A, which admitted 24 students with Advanced Physics Test
scores below 600, the relations between predictors and criteria
were strong, but for institution B, which admitted practically no
students with Advanced Physics Test scores below 600, the
relations between predictors and criteria were relatively weak. The
probabilities of finding the relations observed in the absence of any
actual correlation between the predictors and criteria are shown in
Table 18.
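The manual does not say how Michels computed these probabilities. As a rough illustration of the idea only, the following sketch approximates the two-sided probability of observing a correlation at least as large as r in a sample of n pairs when the true correlation is zero. It assumes a Pearson-type correlation and uses the Fisher z-transformation with a normal approximation; the function name and the example values of r are hypothetical and are not taken from the study.

```python
import math


def two_sided_p_value(r: float, n: int) -> float:
    """Approximate probability of a correlation at least as extreme as r
    arising by chance from n pairs when the true correlation is zero.

    Assumption: Fisher z-transformation with a normal approximation,
    which is reasonable for moderate n but is not an exact test.
    """
    if n <= 3:
        raise ValueError("the approximation needs more than 3 pairs")
    if abs(r) >= 1.0:
        return 0.0
    z = math.atanh(r) * math.sqrt(n - 3)      # standardized Fisher z
    return math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail area


if __name__ == "__main__":
    # Hypothetical correlations for a group of 72 students.
    for r in (0.10, 0.30):
        print(f"n = 72, r = {r:.2f}: p is about {two_sided_p_value(r, 72):.3f}")
```

A small probability means the observed relation would rarely arise by chance alone; a probability near 1 means the data are fully compatible with no relation at all.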



Table 18: Probabilities of Finding Relations Observed
between Predictors and Criteria in Physics
for Institutions A and B If the Correlations
between Predictors and Criteria Were Zero

                                                       Probability
Predictor                       Criterion              Institution A   Institution B

GRE Advanced Physics            RI (first year)        .023            .81
GRE Verbal Ability              RI (first year)        .31             .50
GRE Quantitative Ability        RI (first year)        .02[?]          .62
Sum of Advanced Physics and
  Quantitative Ability Scores   RI (first year)        <.001           .82
GRE Advanced Physics            RI (three years)       [?].001         .[?]2

It can be concluded that the Advanced Physics and quantitative
ability scores are useful in distinguishing between outstanding and
poor physics graduate students, but not very useful for
distinguishing among various levels of outstanding students.
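The contrast between institutions A and B is the pattern usually described as restriction of range: when nearly all admitted students score above a cutoff, the spread of predictor scores shrinks and the observed correlation falls. The sketch below is not from the manual. It applies the standard textbook formula for direct selection on the predictor, assuming a linear relation with constant error variance; the function names and the numbers in the example are hypothetical.

```python
import math


def restricted_r(r_full: float, u: float) -> float:
    """Correlation expected in a selected group.

    r_full: correlation in the unselected group.
    u: ratio of the predictor's standard deviation in the selected group
       to its standard deviation in the unselected group (0 < u <= 1).
    """
    return r_full * u / math.sqrt(1 - r_full ** 2 + (r_full * u) ** 2)


def corrected_r(r_selected: float, u: float) -> float:
    """Inverse of restricted_r: estimate the unselected-group correlation
    from the correlation observed in the selected group."""
    big_u = 1.0 / u
    return r_selected * big_u / math.sqrt(
        1 - r_selected ** 2 + (r_selected * big_u) ** 2
    )


if __name__ == "__main__":
    # Hypothetical case: a correlation of .50 among all applicants, and a
    # selected group whose score spread is half that of the applicants.
    r_sel = restricted_r(0.50, 0.5)
    print(f"expected correlation after selection: {r_sel:.2f}")
    print(f"corrected back to full range: {corrected_r(r_sel, 0.5):.2f}")
```

Under these assumptions, halving the spread of scores cuts a correlation of .50 to roughly half its size, which is consistent in direction with the weaker relations reported for institution B.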

2. The group included in a study by Voorhees (1960) was
composed of 68 graduate students admitted to the Department of
Physics of the University of Chicago from the fall of 1950 through
the fall of 1966. The predictor was the GRE Advanced Physics Test
score and the criterion was success in graduate study. Successful
students were those who obtained the Ph.D. or had passed the
Ph.D. candidacy examination. Approximately 92 percent of those
with scores of 600 or above on the Advanced Physics Test were in
the successful group. Only 53 percent of those with scores below
600 were successful.
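The comparison of success rates above and below a score of 600 is a simple expectancy tabulation. The sketch below shows how such rates can be computed from individual records; the function name and the records are hypothetical and are not the Voorhees data, which the manual reports only as percentages.

```python
def success_rates_by_cut(records, cut=600):
    """Return (rate at or above cut, rate below cut).

    records: iterable of (score, succeeded) pairs, where succeeded is a bool.
    A group with no members yields float('nan').
    """
    above = [ok for score, ok in records if score >= cut]
    below = [ok for score, ok in records if score < cut]

    def rate(group):
        return sum(group) / len(group) if group else float("nan")

    return rate(above), rate(below)


if __name__ == "__main__":
    # Hypothetical (score, succeeded) records for illustration only.
    sample = [
        (650, True), (700, True), (610, False), (600, True),
        (590, True), (520, False), (480, False),
    ]
    hi, lo = success_rates_by_cut(sample)
    print(f"600 or above: {hi:.0%} successful; below 600: {lo:.0%} successful")
```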

3. The group included in a study by Lannholm, Marco, and
Schrader (1966) was composed of 39 students first enrolled in a
particular graduate department of physics between the fall of 1957
and June 1960. The predictors were scores on the GRE Advanced
Physics Test and the GRE Aptitude Test. The criterion was success
in graduate study, defined as having earned the Ph.D. or being still
enrolled and rated by faculty members as outstanding or superior
in the fall of 1963. The results are reported in Table 19. Some of the
reasons for unexpectedly low or negative correlations between
tests and performance are explained in Chapter 6.



Table 19: Validities for 39 Physics Students

                              Correlation    Mean            Standard Deviation
                              with           Performance     of Performance
Predictor                     Criterion      on Predictor    on Predictor

GRE Advanced Physics          .03            603             107
GRE Verbal Ability            .27            [illegible]     96
GRE Quantitative Ability      .01            [illegible]     81











4. The group in another study by Lannholm et al. (1968) was
composed of 36 students first enrolled in a particular graduate
department of physics between the fall of 1957 and June 1960. The
predictors were scores on the GRE Advanced Physics Test and
undergraduate grade-point average. The criterion was success in
graduate study, defined as having earned the Ph.D. or being still
enrolled and rated by faculty members as outstanding or superior
in the fall of 1963. The results are reported in Table 20.



Table 20: Validities for 38 Physics Students

                              r-biserial     Mean            Standard Deviation
                              Correlation    Performance     of Performance
Predictor                     with Criterion on Predictor    on Predictor

GRE Advanced Physics          .08            621             121
UGPA                          .40            3.13            .[?]5



5. Roberts (1970) studied the records of 27 students who had
enrolled at Wake Forest University from June 1964 to June 1970 for
graduate study in physics and who had completed at least nine
hours of graduate work. The correlations between graduate
grade-point averages and GRE test scores were -.50 for verbal ability,
-.01 for quantitative ability, and .30 for Advanced Physics.

6. The subjects for a study by Creamer (1965) were 600 male
applicants for National Science Foundation fellowships in 1955 and
1956. The predictors were scores on the GRE Aptitude Test (verbal
and quantitative) and the Advanced Physics Test. One criterion was
time lapse between attainment of the B.A. and the Ph.D., coded as
shown below:



B.A.-Ph.D. Time Lapse (in years):   Less than 4     No Ph.D. by Aug. '64
Coded Variable:                     1               [illegible]



A second criterion was the dichotomous variable of attaining or not
attaining a Ph.D. by August 1964. The third criterion was the
dichotomous variable of attaining or not attaining a Ph.D. in the
average time taken to attain a Ph.D. in the field. The relationships
between predictors and criteria are shown in Table 21.
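Table 21 reports both point-biserial and biserial coefficients for the two dichotomous criteria. The sketch below shows the usual textbook definitions of the two coefficients; it is not taken from Creamer's report. The biserial version assumes that a normally distributed trait underlies the dichotomy. The function names and the sample scores are hypothetical.

```python
import math
from statistics import NormalDist, mean, pstdev


def point_biserial(scores, attained):
    """Correlation between a continuous score and a 0/1 criterion."""
    ones = [x for x, y in zip(scores, attained) if y]
    zeros = [x for x, y in zip(scores, attained) if not y]
    p = len(ones) / len(scores)          # proportion attaining the criterion
    q = 1.0 - p
    return (mean(ones) - mean(zeros)) / pstdev(scores) * math.sqrt(p * q)


def biserial(scores, attained):
    """Biserial correlation: rescales the point-biserial under the
    assumption of a normal trait underlying the 0/1 split."""
    p = sum(1 for y in attained if y) / len(attained)
    q = 1.0 - p
    std_normal = NormalDist()
    # Height of the standard normal curve at the point dividing p from q.
    ordinate = std_normal.pdf(std_normal.inv_cdf(p))
    return point_biserial(scores, attained) * math.sqrt(p * q) / ordinate


if __name__ == "__main__":
    # Hypothetical test scores and Ph.D. attainment flags.
    scores = [500, 550, 600, 650, 700, 750]
    attained = [0, 0, 1, 0, 1, 1]
    print(f"point-biserial: {point_biserial(scores, attained):.2f}")
    print(f"biserial:       {biserial(scores, attained):.2f}")
```

For the same data the biserial coefficient is always the larger of the two in absolute value, which matches the pattern of paired entries in Table 21.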



Table 21: Validities of GRE against Doctorate Attainment
for 600 Males Who Were Applicants for National Science
Foundation Fellowships in Physics in 1955 and 1956

                                                    Criteria
                              Reflected      Ph.D. by               Ph.D. in
                              B.A.-Ph.D.     Point                  Point
Predictors                                   Biserial   Biserial    Biserial   Biserial

GRE Verbal Ability            .15            .15        .19         .12        .1[?]
[GRE Quantitative Ability]    .26            .26        .33         .23        .30
GRE Advanced Physics          .34            .32        .[?]1       .30        .3[?]
[label illegible]             .35            .33        .[?]3       .33        .42

... the predictors with the sign reversed






ADVANCED POLITICAL SCIENCE TEST

Content

In preparing the test, diversity of curriculums and backgrounds of
students are taken into account. The questions are drawn from the
courses of study most commonly offered.

The distribution of questions among subfields of the discipline,
and within each subfield, according to skills, processes, and
approaches is reviewed yearly by the committee of examiners. To
facilitate the process of allocating questions, the committee has
developed the following test specifications. The specifications are
an approximation of the content breakdown, but the cell
percentages do not serve as rigid guidelines for the selection of
questions.



Specifications for the GRE
Advanced Political Science Test

Column headings legible in the source, in order: International;
Political Systems; American; TOTAL.

                                       Cell entries as printed,
Row                                    left to right                     TOTAL

[label illegible]                      2.5%    2.5%    2.5%              10.0%
Politics and Political ...;
  Legislative; Administrative;
  Voting; Attitudes                    7.5%   12.5%   17.5%    5.0%      42.5%
Government Structure; Organization     [cells illegible]                 12.5%
Theory and Approaches                  1.5%            2.0%    5.0%      10.0%
Methodology                            1.0%            1.0%   12.0%      15.0%
TOTAL                                 15.0%   22.5%           24.5%      90.0%
... Thought                                                              10.0%
TOTAL                                                                   100.0%



Although specific information is needed to answer many of the
questions, most questions are not limited solely to recall.



Responses to Background Questions, 1970-71
(N = 5,314)

A. At what point are you in your studies?

 3% (1) I am in or have just completed my junior year of
        undergraduate study.
62% (2) I am an undergraduate senior.
21% (3) I have a bachelor's degree but am not presently enrolled
        in graduate school.
 8% (4) I am in or have just completed my first year of graduate
        study.
 5% (5) I am in or have completed my second year of graduate
        study.

B. What graduate degree do you intend to seek?

 3% (1) I do not plan to pursue graduate study.
 2% (2) I plan to pursue graduate work but not to obtain a
        graduate degree.
39% (3) I plan to obtain terminal M.A., M.S., or other degree at the
        master's level.
30% (4) I plan to obtain M.A., M.S., or other master's level degree
        leading to a doctoral degree.
25% (5) I plan to obtain Ph.D., Ed.D., or other degree at the
        doctoral level.

C. What is (was) your undergraduate major?

80% (1) Political science
 7% (2) Other social science
 1% (3) Mathematics or a natural science
 6% (4) History or other humanities
 6% (5) Other

D. What is (was) your undergraduate minor?

 7% (1) Political science
19% (2) Other social science
 2% (3) Mathematics or a natural science
26% (4) History or other humanities
41% (5) Other or not applicable

E. How would you classify the distribution of your course work
in political science?

14% (1) Highly concentrated in one field
28% (2) Directed toward one field
20% (3) Fairly evenly divided between two major fields
13% (4) Distributed among three fields
25% (5) Distributed equally across the discipline

F. In which of the following fields did you concentrate your
undergraduate work in political science?

34% (1) American government and politics (including public law)
 5% (2) Urban affairs
15% (3) Comparative government and politics
26% (4) International relations
12% (5) Political theory (normative)

G. If you were given the opportunity to select one of the
following tests or methods of reporting scores for graduate school
admission, which would you prefer?

 9% (1) The current Advanced Political Science Test covering all
        fields, with a single score reported
19% (2) The current examination, with separate scores reported
        for each major field covered on the test
21% (3) A shortened version of the present test plus your choice
        of one of three or four optional field tests, with two
        separate scores reported
36% (4) Your choice of two of the following field tests with two
        scores reported: American Government and Politics,
        Comparative Government and Politics, International
        Relations, Normative Political Theory
12% (5) Other






H. Have you taken a political science methodology and/or
statistics course?

61% (1) I have taken neither a methodology nor a statistics
        course.
1[?]% (2) I have taken a methodology course only.
10% (3) I have taken a statistics course only.
 5% (4) I have taken one course combining methodology and
        statistics.
 5% (5) I have taken both a methodology and a statistics course.

I. Have you ever done independent research requiring the
collection, processing, and interpretation of data?

39% (1) Yes
69% (2) No



ADVANCED PSYCHOLOGY TEST

Content

The questions in the test are drawn from courses of study most
commonly offered within the broadly defined field of psychology.
Questions in the test often require the student to identify
psychologists associated with particular theories or conclusions
and to recall information from psychology courses. In addition,
some questions require analyzing relationships, applying
principles, drawing conclusions from experimental data, and
evaluating experiments.

Although the test offers only two subscores, there are questions
in three content categories, as follows:

1. Experimental or natural science oriented, with questions
distributed about equally among learning, physiological and
comparative, and perception and sensory psychology.

2. Social or social science oriented, with questions distributed
about equally among personality, clinical and abnormal,
developmental, and social psychology.

3. General, including historical and applied psychology,
measurement, and statistics.

A separate subscore is reported for only the first two of these
categories. Evidence from students' performance on test questions
shows that questions within each of the two subscore categories
are more closely related to each other than are questions in
different categories. Each of the subscores reported for the test is
based on approximately 40 percent of the questions in the entire
test.

Research Related to Additional Subscores on the
Advanced Psychology Test

Two subscores, Experimental Psychology and Social Psychology,
are currently reported for the Advanced Psychology Test. The GRE
Board has considered, however, that "it is both desirable and
feasible to report more detailed and useful part-score information on
the basis of the Advanced Tests." The recommendation that the
reporting of additional subscores on the Advanced Tests be
investigated stemmed from a feeling on the part of both the Board and
several of the committees of examiners that more than a single score
should be produced from three hours of testing and that tests would
be much more useful, particularly to students, if they could indicate
strengths and weaknesses in the several subfields of each content
area. Also, the Board and the committees recognized that such
subscores could be valuable for many counseling and placement
decisions.

However, in spite of the widespread agreement about the
desirability of reporting as many subscores as possible, there are a
number of areas of concern (including the reliability and
independence of the subscores) that center on the eventual use of the
subscores. If the use of subscores were restricted to placement and
guidance, the importance of these concerns would diminish.
Placement and counseling decisions are reversible, whereas admission
decisions generally are not; therefore, much lower standards of
statistical adequacy would be acceptable if subscores were not
used for admission decisions. This would enable many more
subscores to be reported, while allowing test committees to give
continued emphasis to the various elements in their disciplines.

A study was designed to investigate the number of logically
meaningful subscores that could be generated from the Advanced
Psychology Test if the statistical standards for subscores used for
admission purposes are relaxed. The study examined the reliability
and independence of subscores based on the eight major content
areas of the Advanced Psychology Test as an initial step in
determining the extent to which the propositions endorsed by the GRE
Board were conceptually and psychometrically feasible for
extension to the design and administration of the GRE Advanced Tests.
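The manual does not state which statistics were used to judge reliability and independence. As one plausible illustration only, the sketch below computes a Kuder-Richardson formula 20 reliability for a subscore made up of right/wrong items, and a correlation between two subscores corrected for attenuation. A corrected correlation well below 1.0 is one common sign that two subscores measure something distinct. The function names and the item data are hypothetical.

```python
import math
from statistics import pvariance


def kr20(item_matrix):
    """Kuder-Richardson formula 20 for 0/1 item scores.

    item_matrix: one row per examinee, one 0/1 entry per item.
    """
    n_people = len(item_matrix)
    n_items = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_matrix) / n_people  # proportion right
        sum_pq += p * (1.0 - p)
    return n_items / (n_items - 1) * (1.0 - sum_pq / pvariance(totals))


def disattenuated_r(r_xy, rel_x, rel_y):
    """Correlation between two subscores corrected for their unreliability."""
    return r_xy / math.sqrt(rel_x * rel_y)


if __name__ == "__main__":
    # Hypothetical responses of five examinees to a four-item subscore.
    responses = [
        [1, 1, 1, 1],
        [1, 1, 1, 0],
        [1, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 0],
    ]
    print(f"KR-20 reliability: {kr20(responses):.2f}")
    # Hypothetical observed correlation and reliabilities of two subscores.
    print(f"corrected correlation: {disattenuated_r(0.60, 0.80, 0.90):.2f}")
```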

On the assumption that a subscore might be developed in each of
the major content areas established by the committee as part of its
test specifications, content analysis was used to define the
structure of two forms of the GRE Advanced Psychology Test. These
areas, each of which would provide valuable data for counseling
graduate students, include: (1) Personality, (2) Learning,
(3) Measurement, (4) Developmental Psychology, (5) Social Psychology,
(6) Physiological and Comparative Psychology, (7) Perception and
Sensory Psychology, and (8) Clinical and Abnormal Psychology.






The results of the study show that each of the eight subscores
appears to meet the criteria of independence set for subscores in the
GRE Program. This confirms the validity of the committee of
examiners' belief that subscores based on the eight content areas
would be about as independent as the two subscores being
reported at present. The committee feels that subscores based on
the content areas would be by far the most useful ones for purposes
of guidance and placement because the curriculum tends to be
organized in the same way.

In addition, it appears that subscores based on the content areas
defined by the committee have considerable potential to provide
information about students with unusually high or low scores for use
in guidance and placement.

Factor analysis was used as a second and less subjective method
of examining the structure of the two forms of the GRE Advanced
Psychology Test, and it offers an independent description of the
observed data.

Factor analytic techniques were used to investigate two separate
questions:

1. To what extent do the items in each content-determined
subscore appear to be measuring a single general factor?

2. Do other logically valid and potentially useful groupings
(subscores) of the items exist in the data?

The committee felt that the results of the factor analysis of the
two test forms were basically consistent in identifying factors that
measure primarily knowledge of facts, knowledge of theories, and
powers of interpretation and analysis. However, the committee did
not believe that subscores based on the factor analysis would be
useful for guidance and placement. The point was made that the
curriculum is organized along content-determined lines, and that
students need to know their strengths and weaknesses in those
terms. The committee felt that the results of the factor analysis in no
way limited the usefulness of subscores based on content areas.

The committee's content analysis is a logical way of relating test
content to the curriculum, and is intended to ensure that all the
major areas of the curriculum are appropriately represented in the
test. In view of the relative independence of the subscores based on
content areas, the fact that each content area did not emerge as a
separate factor in the factor analysis was not disturbing. Such a
result is consistent with the fact that content areas are not learned
independently, that many require introductory courses covering all
the major areas of psychology, and that much psychological theory
is applicable across content areas.

The future of such subscore reporting depends, however, on
findings for other Advanced Tests and significant changes in the
GRE Program to report such subscores in a way that would prevent
their use for other than counseling purposes.



Responses to Background Questions, 1970-71
(N = 17,578)

A. At what point are you in your studies?

 4% (1) I am in or have just completed my junior year of
        undergraduate study.
65% (2) I am an undergraduate senior.
16% (3) I have a bachelor's degree but am not presently enrolled
        in graduate school.
 7% (4) I am in or have just completed my first year of graduate
        study.
 5% (5) I am in or have completed my second year of graduate
        study.

B. What graduate degree do you intend to seek?

 2% (1) I do not plan to pursue graduate study.
 1% (2) I plan to pursue graduate work but not to obtain a
        graduate degree.
23% (3) I plan to obtain terminal M.A., M.S., or other degree at the
        master's level.
27% (4) I plan to obtain M.A., M.S., or other master's level degree
        leading to a doctoral degree.
45% (5) I plan to obtain Ph.D., Ed.D., or other degree at the
        doctoral level.

C. If you are now a college senior, which of the following best
describes your educational experience and your plans with
respect to the graduate study of psychology? (If you are not a
senior, mark 5.)

50% (1) I am an undergraduate major in psychology and I plan to
        do graduate work in psychology.
10% (2) I am an undergraduate major in psychology and I plan to
        do graduate work in some other field related to
        psychology.
 2% (3) I am an undergraduate major in psychology and I plan to
        do graduate work in some field not related to psychology.
 4% (4) I am not an undergraduate major in psychology but I plan
        to do graduate work in psychology.
27% (5) Other

D. In what general area would you classify your undergraduate
major?

79% (1) Social science
 5% (2) Biological science
    (3) Physical science
 1% (4) Mathematics
12% (5) Other

E. What is (was) your undergraduate major?

83% (1) Psychology
 1% (2) Philosophy
 2% (3) Sociology
 2% (4) Education
11% (5) Other






F. In what area of psychology have you had the most course
work?

31% (1) Clinical or abnormal
 8% (2) Educational
29% (3) Experimental
13% (4) Social
18% (5) Other

G. Which of the following best describes your work in these
three courses: general (or introductory) psychology,
experimental psychology, and statistics?

14% (1) General psychology only
 1% (2) Experimental psychology only
10% (3) General psychology and experimental psychology only
64% (4) General psychology, experimental psychology, and
        statistics
10% (5) Other

H. How recently have you had a college or graduate course in
psychology?

77% (1) During the current academic year
13% (2) During the previous academic year
 5% (3) Two or three years ago
 2% (4) Four or five years ago
 2% (5) Other

I. In what area of psychology, if any, do you plan to pursue your
career?

44% (1) Clinical or abnormal
13% (2) Educational
11% (3) Experimental
10% (4) Social
20% (5) Other, or not in psychology



Validity Data

1. Rock (1972) reported on a group composed of 778 applicants
for National Science Foundation fellowships. Most applied for NSF
fellowships in 1958-61. The predictors were scores on the GRE
Advanced Psychology Test and the GRE Aptitude Test (verbal and
quantitative), undergraduate grade-point average, and an average
rating of reference letters. The criterion was attainment of the
doctorate by June 1968. The group was split into two random halves;
the validity coefficients for each half are shown in Table 22.
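Splitting a sample into two random halves and computing the validity coefficient in each is a simple check on how stable a coefficient is from one sample to another. The sketch below illustrates the procedure with an ordinary Pearson correlation on hypothetical data; Rock's study used a dichotomous criterion and biserial coefficients, so this shows the splitting step only, and the function names and seed are assumptions.

```python
import math
import random
from statistics import mean


def pearson(xs, ys):
    """Plain Pearson product-moment correlation."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def split_half_validities(scores, criterion, seed=0):
    """Shuffle the cases, split them into two halves, and return the
    predictor-criterion correlation computed separately in each half."""
    order = list(range(len(scores)))
    random.Random(seed).shuffle(order)      # reproducible random split
    middle = len(order) // 2
    halves = (order[:middle], order[middle:])
    return [
        pearson([scores[i] for i in half], [criterion[i] for i in half])
        for half in halves
    ]


if __name__ == "__main__":
    # Hypothetical predictor scores and criterion ratings for 12 cases.
    scores = [48, 52, 55, 57, 60, 61, 63, 66, 68, 70, 73, 75]
    ratings = [2, 3, 2, 4, 3, 5, 4, 4, 6, 5, 6, 7]
    first, second = split_half_validities(scores, ratings)
    print(f"first half r = {first:.2f}, second half r = {second:.2f}")
```

Coefficients that differ widely between the halves suggest that a single reported value owes a good deal to sampling fluctuation.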

2. A group reported on by Lorge (1960) was composed of 165
graduate students majoring in Psychological Foundations of
Education at Teachers College, Columbia University. Using the
score obtained on the doctoral written examination as the criterion,
correlations of .41 with the GRE Advanced Psychology Test score,
.63 with the GRE verbal ability score, and .32 with the GRE
quantitative ability score were found.

3. The group involved in a study by Sistrunk (1961) was composed
of 73 graduate students in psychology at the University of Miami.
The correlation between the predictor (GRE Advanced Psychology
Test score) and the criterion (department examination) was .56.

Table 22: Validities Using the Criterion of Attainment
of the Doctorate for 778 National Science Foundation
Fellowship Applicants in Psychology in 1958-61,
Split Into Two Random Halves

                     First Random Half                   Second Random Half
                     r-biserial    Predictor             r-biserial    Predictor
                     Correlation   Performance           Correlation   Performance
                     with                    Standard    with                    Standard
Predictors           Criterion     Mean      Deviation   Criterion     Mean      Deviation

GRE Advanced         .19           [illeg.]  [illeg.]    .24           60.07     9.05
GRE Verbal           .12           [illeg.]  [illeg.]    .19           [illeg.]  [illeg.]
GRE Quantitative     .33           [illeg.]  11.34       .14           [illeg.]  [illeg.]
UGPA                 .02           341.70    [illeg.]    .02           236.76    [illeg.]
Reference            .10           43.86     [illeg.]    .14           43.63     6.49

GRE scores: scaled score with third digit dropped
UGPA: four-point scale multiplied by 100
Reference: zero to 6 multiplied by 10
[footnote marker illegible] n = 402

4. The group included in a study by Lannholm, Marco, and
Schrader (1966) was composed of 47 students first enrolled in a
particular graduate department of psychology between the fall of
1957 and June 1960. The predictors were scores on the GRE
Advanced Psychology Test and the GRE Aptitude Test. The criterion
was success in graduate study, defined as having earned the Ph.D.
or being still enrolled and rated by faculty members as outstanding
or superior in the fall of 1963. The results are shown in Table 23.



Table 23: Validities for 47 Psychology Students

                              r-biserial     Mean            Standard Deviation
                              Correlation    Performance     of Performance
Predictor                     with Criterion on Predictor    on Predictor

GRE Advanced Psychology       .35            685             49
GRE Verbal Ability            .26            705             62
GRE Quantitative Ability      .27            670             [illegible]



5. The group involved in another study by Lannholm et al. (1968)
was composed of 38 students first enrolled in a particular graduate
department of psychology between the fall of 1957 and June 1960.
The predictors were scores on the GRE Advanced Psychology Test
and the GRE Aptitude Test. The criterion was success in graduate
study, defined as having earned the Ph.D. or being still enrolled and
rated by faculty members as outstanding or superior in the fall of
1963. The results are shown in Table 24.



Table 24: Validities for 38 Psychology Students

                              r-biserial     Mean            Standard Deviation
                              Correlation    Performance     of Performance
Predictor                     with Criterion on Predictor    on Predictor

GRE Advanced Psychology       -.11           [illegible]     65
GRE Verbal Ability            [illegible]    [illegible]     [illegible]
GRE Quantitative Ability      .15            608             99




6. A third study by Lannholm et al. (1968) involved a group
composed of 20 students first enrolled in a particular graduate
department of psychology between the fall of 1957 and June 1960. The
predictors were scores on the GRE Advanced Psychology Test and
the GRE Aptitude Test and undergraduate grade-point average. The
criterion was success in graduate study, defined as having earned
the Ph.D. or being still enrolled and rated by faculty members as
outstanding or superior in the fall of 1963. The results are shown in
Table 25.



Table 25: Validities for 26 Psychology Students

                              r-biserial     Mean            Standard Deviation
                              Correlation    Performance     of Performance
Predictor                     with Criterion on Predictor    on Predictor

GRE Advanced Psychology       .29            607             75
GRE Verbal Ability            .27            [illegible]     90
GRE Quantitative Ability      .45            518             96
UGPA                          [illegible]    2.96            .44



7. A fourth group studied by Lannholm et al. (1968) was composed
of 26 students first enrolled in a particular graduate department of
psychology between the fall of 1957 and June 1960. The predictors
were scores on the GRE Advanced Psychology Test and the GRE
Aptitude Test. The criterion was success in graduate study, defined
as having earned the Ph.D. or being still enrolled and rated by
faculty members as outstanding or superior in the fall of 1963. The
results are shown in Table 26.



Table 26: Validities for 26 Psychology Students

                              Correlation    Mean            Standard Deviation
                              with           Performance     of Performance
Predictor                     Criterion      on Predictor    on Predictor

GRE Advanced Psychology       [illegible]    62[?]           [?]6
GRE Verbal Ability            -.27           653             70
GRE Quantitative Ability      -.24           597             79



Table 27: Validities for 31 Students in Psychology
at New York University

                                      Criteria                   Performance
                                                                 on Predictors
                                      Ph.D.         Percentage               Standard
Predictors                            Attainment    of A's       Mean        Deviation

GRE Advanced Psychology               .66           .44          [illeg.]    [illeg.]
GRE Verbal Ability                    .17           -.01         [illeg.]    72
GRE Quantitative Ability              .22           .[?]7        107         84
MAT                                   .07           -.2[?]       39          97
Overall UGPA                          [illeg.]      [illeg.]     2.93        .4[?]
Psychology UGPA                       .04           .26          3.34        .49
Number of Undergraduate
  Psychology Courses                  .32           .02          7.33        [illeg.]

Performance on Criteria
  Mean                                .52           [illeg.]
  Standard Deviation                  .51           33.64

GRE scores are neither raw nor scaled scores.
Receiving Ph.D. = 1, not receiving Ph.D. = 0.



8. The subjects of a study by Ewen (1969) were 31 men enrolled in
psychology at New York University no earlier than the fall of 1960.
The predictors were scores on the GRE Advanced Psychology Test,
the GRE Aptitude Test, and the Miller Analogies Test, overall
undergraduate grade-point average, psychology undergraduate
grade-point average, and number of undergraduate psychology
courses taken. The criteria were attainment of the Ph.D. and
percentage of "A" grades in graduate school. Of the 31 subjects, 16
earned the Ph.D. and 15 were dropped or withdrew. The zero-order
correlations between predictors and criteria are shown in Table 27.
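
As a rough illustration of how a dichotomous criterion such as Ph.D. attainment enters a correlational analysis, the sketch below codes the 16 subjects who earned the Ph.D. as 1 and the remaining 15 (31 minus 16) as 0, then computes the mean and standard deviation of that coded variable. The variable names are hypothetical, and whether the study used the sample (n - 1) or population (n) form of the standard deviation is an assumption, so both are shown.

```python
from statistics import mean, pstdev, stdev

# Counts taken from the study description: 31 subjects, 16 earned the Ph.D.
N_SUBJECTS = 31
N_PHD = 16

# Hypothetical coded criterion: 1 = received Ph.D., 0 = did not.
outcomes = [1] * N_PHD + [0] * (N_SUBJECTS - N_PHD)

criterion_mean = mean(outcomes)          # equals the proportion earning the Ph.D.
criterion_sd_sample = stdev(outcomes)    # divides by n - 1 (assumed convention)
criterion_sd_population = pstdev(outcomes)  # divides by n (alternative convention)

print(f"mean = {criterion_mean:.3f}")
print(f"sample SD = {criterion_sd_sample:.3f}")
print(f"population SD = {criterion_sd_population:.3f}")
```

The mean of a 1/0 variable is simply the proportion of subjects coded 1, which is why a single number summarizes the attainment rate.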

9. The subjects of a study reported on by Hackman, Wiggins, and
Bass (1970) were 42 students who began doctoral work in
psychology at the University of Illinois in 1963. The predictors
included scores on the GRE Advanced Psychology Test and the GRE
Aptitude Test, the number of languages spoken, the number of
languages read, the number of semester hours of language taken
as an undergraduate, undergraduate grade-point average in all
courses in the junior and senior years, and the quality of the
undergraduate institution as judged by a faculty committee. The criteria
at the end of the first year of graduate study included grades, two
student assessments of their own progress, and two faculty
judgments of student progress. The first student assessment dealt with
how rapidly they thought they were progressing toward a
doctorate, and the second dealt with whether or not they planned to
continue graduate study at Illinois. The first faculty rating was made
by teachers who had had the students in a course; the second was
made by heads of departmental divisions. Six years after beginning
graduate work all students had either earned a Ph.D. or had
withdrawn. All students were rated by faculty members on a 9-point
scale of success to provide a long-term criterion. The results are
shown in Table 28.



Table 28: Correlations Between Predictors and Criteria for
42 Students in Psychology at the University of Illinois

                                                    Criteria
                                            End of First Year                          After
                                     Student                                           6 Years
                                     Self-Assessment           Faculty Ratings
                                     Speed to     Plans to                 Dept.       Success
Predictors              Grades       Degree       Continue     Teachers    Heads       Rating

GRE Advanced Psychology .28          .23          .16          [illegible] .12         -.11
GRE Verbal Ability      [illegible]  .45*         .23          .21         .20         .19
GRE Quantitative
  Ability               .15          [illegible]  .03          .29         .23         .32*
No. Languages Spoken    .04          -.04         .39*         -.04        .10         -.21
No. Languages Read      .19          .07          [illegible]  .01         .20         -.25
Hours Language Study    [illegible]  -.20         -.25         -.03        -.02        [illegible]
UGPA Last 2 Years       .28          .02          -.04         -.08        .05         -.22
Quality of Undergraduate
  Institution           .15          .30*         [illegible]  .31*        .06         .43*

*Significant at the .05 level.
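
The manual does not say which test or how many tails lie behind the ".05 level" flag. As a minimal sketch, assuming a two-tailed test and using the Fisher z approximation (both assumptions, not statements from the source), the smallest correlation that would be flagged for a sample of 42 students can be estimated as follows. The function name is hypothetical.

```python
from math import sqrt, tanh
from statistics import NormalDist


def approx_critical_r(n, alpha=0.05):
    """Approximate two-tailed critical value of a correlation for sample size n.

    Uses the Fisher z transformation: z = atanh(r) is treated as normal with
    standard error 1 / sqrt(n - 3). This is an approximation, not an exact
    t-based value.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return tanh(z_crit / sqrt(n - 3))


# Sample size taken from the study description (42 students).
print(round(approx_critical_r(42), 3))
```

Under these assumptions the threshold for 42 students comes out close to .30, and it shrinks as the sample grows, which is worth keeping in mind when comparing flagged coefficients across studies of different sizes.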



10. A group reported on by Newman (1968) was composed of 66
graduate students studying for advanced degrees in the Department
of Psychology at Washington State University. The predictors
were scores on the GRE Advanced Psychology Test and the GRE
Aptitude Test for 27 students. For the remaining 39 students the
predictors were only the two Aptitude Test scores. The criterion
was graduate grade-point average. The results are shown in Table
29.






Table 29: Validities for 66 Psychology Students at
Washington State University

                                            Mean           Standard Deviation
                           Correlation      Performance    of Performance
Predictor                  with Criterion   on Predictor   on Predictor

GRE Advanced Psychology    .09              [illegible]    56
GRE Verbal Ability         .05              [illegible]    [illegible]
GRE Quantitative Ability   .21*             [illegible]    104

*Significant at the .05 level.



11. The subjects for a study by Creager (1965) were 99 applicants
for National Science Foundation fellowships in 1955 and 1956. The
predictors were scores on the GRE Aptitude Test (verbal and
quantitative) and the Advanced Psychology Test. One criterion was time
lapse between attainment of the B.A. and the Ph.D., coded as
shown below.

B.A.-Ph.D. Time     Less                              No Ph.D. by
Lapse (in years):   than 4   4   5   6   7   8   9    Aug. '64
Coded Variable:     1        2   3   4   5   6   7    8

A second criterion was the dichotomous variable of attaining or not
attaining a Ph.D. by August 1964. The third criterion was the
dichotomous variable of attaining or not attaining a Ph.D. in the
average time taken to attain a Ph.D. in the field. The relationships
between predictors and criteria are shown in Table 30.
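
The time-lapse coding described above can be written as a small mapping. This is only a sketch: the function name and the use of `None` for "no Ph.D. by August 1964" are illustrative assumptions, and the coding scheme as given does not say how a lapse longer than 9 years was handled, so the sketch rejects that case rather than guessing.

```python
NO_PHD_CODE = 8  # code assigned to "No Ph.D. by Aug. '64"


def code_time_lapse(years):
    """Return the coded variable for a B.A.-to-Ph.D. time lapse in whole years.

    years=None stands for "no Ph.D. by August 1964" (a hypothetical convention).
    """
    if years is None:
        return NO_PHD_CODE
    if years < 4:
        return 1  # "less than 4" years
    if years > 9:
        # Not covered by the coding scheme as described.
        raise ValueError("time lapse above 9 years is not defined in the coding")
    return years - 2  # 4 -> 2, 5 -> 3, ..., 9 -> 7


# Example: codes for a few hypothetical applicants.
print([code_time_lapse(y) for y in (3, 4, 6, 9, None)])
```

Because a lower code means a shorter time to the degree, a raw correlation between test scores and this variable would be negative when higher scorers finish sooner; reversing the sign makes faster attainment read as a positive relationship.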



Table 30: Validities of GRE against Doctorate Attainment
for 99 Applicants for National Science Foundation
Fellowships in Psychology in 1955 and 1956

                                                     Criteria
                                                                        Ph.D. in
                           B.A.-Ph.D.     Ph.D. by 1964                 Average Time
                           Time           Point                         Point
Predictors                 Lapse(a)       Biserial      Biserial        Biserial    Biserial

GRE Verbal Ability         .13            [illegible]   [illegible]     .17         .24
GRE Quantitative Ability   .17            .13           .16             .16         .22
GRE Advanced Psychology    .33            .25           .34             .30         .42
Composite                  .34            [illegible]   .31             .30         .42

(a) Correlations between the coded variable for B.A.-Ph.D. time lapse given above and
the predictors, with the signs reversed.
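
The dichotomous criteria in Table 30 appear to be reported with two kinds of coefficient, point-biserial and biserial. As a hedged sketch of how the two relate under the usual textbook assumption that a normally distributed variable underlies the dichotomy, the conversion below uses only the proportion `p` of cases in the "attained" group. The function name and the example values are hypothetical; the manual does not report the proportions needed to reproduce its own figures.

```python
from math import sqrt
from statistics import NormalDist


def biserial_from_point_biserial(r_pb, p):
    """Convert a point-biserial correlation to a biserial correlation.

    p is the proportion of cases in one category of the dichotomy. The
    conversion assumes a latent normal variable split at the point that
    leaves proportion p on one side.
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    q = 1 - p
    std_normal = NormalDist()
    # Height of the standard normal density at the cut point.
    ordinate = std_normal.pdf(std_normal.inv_cdf(p))
    return r_pb * sqrt(p * q) / ordinate


# Hypothetical example: an even split and a point-biserial of .30.
print(round(biserial_from_point_biserial(0.30, 0.5), 3))
```

The biserial value is always larger in magnitude than the point-biserial, and the gap widens as the split becomes more uneven.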



12. The group included in a study by Mehrabian (1969) was
composed of 79 students enrolled in the graduate psychology program
at the University of California, Los Angeles. The predictors were
scores on the GRE Advanced Psychology Test, the GRE Aptitude
Test, and the Miller Analogies Test, overall undergraduate
grade-point average, UGPA in the last two undergraduate years, number
of undergraduate courses in mathematics and logic taken, faculty
ratings of promise as a student and researcher, faculty rating of
research orientation, admissions committee evaluation of promise,
and admissions committee index of acceptability.

The criteria were faculty rating of graduate achievement, average
grade in all first-year content courses, and average grade in all
first-year statistics courses. The results are shown in Table 31.



Table 31: Correlations Between Predictors
and Criteria for 79 Psychology Students
at the University of California, Los Angeles

                                                  Criteria
                                   Faculty Rating   Average       Average
                                   of Graduate      Grade in      Grade in
Predictors                         Achievement      Content       Statistics

GRE Advanced Psychology            .48*             .53*          .61*
GRE Verbal Ability                 [illegible]      .17           .12
GRE Quantitative Ability           .15              .27*          [illegible]
MAT                                .19              .34*          .33*
Overall UGPA                       .06              .10           .05
UGPA Last 2 Years                  .13              .21           [illegible]
Number of Mathematics and
  Logic Courses                    .15              .11           [illegible]
Faculty Rating of Promise          .25*             .25*          .36*
Faculty Rating of Research
  Orientation                      .19              [illegible]   .06
Admissions Committee
  Evaluation of Promise            .10              [illegible]   .33*
Admissions Committee Index
  of Acceptability                 .30*             .21           .20

*Correlation coefficients above .32 are significant at the .05 level.






ADVANCED SOCIOLOGY TEST



Content 

The questions in the test are drawn from the courses of study most
commonly offered in college curriculums. A few examples of
course titles are theory, collective behavior, social institutions,
introductory statistics, urban sociology, demography, human
ecology, social structure and personality, criminology and juvenile
delinquency, public opinion and propaganda, research methods, and
the logic of sociological inquiry. The test aims at a balance among
the many subfields of sociology, and the questions are distributed
among the following areas:



                               PER-                                    PER-
AREA                           CENT     AREA                           CENT

Methodology and statistics      15      Theory                           6
Social psychology                9      Stratification                   5
Race and ethnic relations        8      Comparative sociology            4
Social change                    8      Occupations and professions      4
Complex organization             6      Political sociology              4
Demography                       6      Social organization              4
Deviance and social control      6      Urban/rural sociology            4
Marriage and the family          6      Collective behavior              [illegible]
                                        Human ecology                    2
                                        Religion                         1



The coverage of methodology and statistics in the test is actually
greater than 15 percent, because a number of questions in the
other areas enumerated above also require methodological or
statistical skills.

Recall of specific information is required to answer many of the
questions. The test, however, does not merely measure factual
knowledge as such but instead draws upon such knowledge to test
for ability to interpret the types of data typically encountered by
sociologists and for an understanding of relationships.



Responses to Background Questions, 1970-71
(N = 1,739)

A. At what point are you in your studies?

 3% (1) I am in or have just completed my junior year of
        undergraduate study.
67% (2) I am an undergraduate senior.
18% (3) I have a bachelor's degree but am not presently enrolled
        in graduate school.
 6% (4) I am in or have just completed my first year of graduate
        study.
 4% (5) I am in or have completed my second year of graduate
        study.

B. What graduate degree do you intend to seek?

 5% (1) I do not plan to pursue graduate study.
 3% (2) I plan to pursue graduate work but not to obtain a
        graduate degree.
46% (3) I plan to obtain terminal M.A., M.S., or other degree at the
        master's level.
26% (4) I plan to obtain M.A., M.S., or other master's level degree
        leading to a doctoral degree.
[illegible] (5) I plan to obtain Ph.D., Ed.D., or other degree at the
        doctoral level.

C. What was your undergraduate major field?

63% (1) Sociology only
17% (2) Sociology and another social science
 5% (3) Sociology and the humanities
 2% (4) Sociology and mathematics or natural science
12% (5) Other

D. If your major was sociology, what was your undergraduate
   minor field?

36% (1) Another social science
 5% (2) Education
 4% (3) Science or mathematics
12% (4) Humanities
[illegible] (5) Other

E. Have you had a course in contemporary sociological theory
   and/or a course in the history of social theory?

33% (1) No
12% (2) Yes, a course in contemporary theory
28% (3) Yes, a course in history of social theory
17% (4) Yes, a course combining contemporary theory and the
        history of social theory
 9% (5) Yes, a course in contemporary theory and a course in the
        history of social theory

F. What is the highest level mathematics course you have taken
   in college?

36% (1) No mathematics course in college
36% (2) Algebra and trigonometry
16% (3) Elementary calculus
 3% (4) Advanced calculus
 1% (5) Courses beyond advanced calculus

G. Have you had a course in statistics?

64% (1) Yes
35% (2) No

H. Have you had a course in race and/or ethnic relations?

51% (1) Yes
[illegible] (2) No

I. Have you had a course in demography or population?

22% (1) Yes
76% (2) No






Validity Data



Roberts (1970) studied the records of 24 students who had enrolled
at Wake Forest University from June 1964 to June 1970 for graduate
work in sociology and anthropology and who had completed at
least nine hours of graduate work. The correlations between
graduate grade-point averages and GRE scores were .16 for verbal
ability, .[illegible]7 for quantitative ability, and .69 for the Advanced Test.
Presumably the Advanced Test was in sociology in most instances.



ADVANCED SPANISH TEST 



Content 

In determining the content of the Advanced Spanish Test, the
committee of examiners must take into account (1) the diversity of
subject matter and emphasis in undergraduate curriculums; (2) the
areas of specialization most likely to be entered upon by graduate
students; and (3) the abilities most relevant to the tasks graduate
students are likely to encounter. Accordingly, the test contains
questions in the following broad areas.

Language Proficiency and Knowledge. A certain number of
questions focus directly on the student's mastery of correct structure
and usage. A few questions in the field of descriptive and structural
linguistics are also included. However, since the entire test is in
Spanish, all questions basically involve the student's knowledge of
the language.

The structure and usage questions will give some indication of a
student's ability to write acceptable Spanish; oral skills, however,
are not measured in the framework of this test. Proof of students'
proficiency in speaking Spanish and in understanding spoken
Spanish will, therefore, have to be obtained in other ways.

Literary History and Theory. Questions in this area test the
student's familiarity with those Spanish and Spanish American
authors whose works are most likely to be studied by
undergraduate majors in Spanish. Although some questions are limited
to factual recall, others probe deeper to gauge understanding of
literary trends and ideas. Because of the diversity of undergraduate
literature courses and reading programs, however, it is unlikely that
any students will have read all the works represented in the test
and, consequently, that they will be able to answer all questions in
this area.

Other questions test familiarity with basic concepts and terms of
literary theory.

Literary Interpretation and Insight. The ability to comprehend the
meaning of literary works fully and to interpret them with sensitivity
and insight is of particular importance to students preparing for an
advanced degree in Spanish or Spanish American literature. A
number of questions in the test give them an opportunity to
demonstrate the skills they have acquired during their
undergraduate studies and their aptitude in the area of literary
interpretation. These questions deal with aspects of meaning or form in
literary selections.



Culture and Civilization. The committee feels that some knowledge
and understanding of Spanish and Spanish American culture and
civilization is essential for the student entering upon graduate
study. Accordingly, the test contains a certain number of questions
touching on major aspects of history, geography, institutions,
customs, ideas, and the arts in the Hispanic world.

In assigning relative weights to Peninsular and Spanish American
subject matter, the committee bears in mind the growing attention
given to Spanish America in undergraduate programs. Even so, to
reflect the reality of undergraduate curriculums, somewhat more
weight is allotted to Spain. As a general rule, structure and usage
peculiar to any part of the Spanish-speaking world are avoided
unless knowledge of them is to be explicitly tested.

The test yields a total score and the three subscores of 1)
Interpretive Reading Skills; 2) Peninsular Topics; and 3) Spanish
American Topics.



Responses to Background Questions, 1970-71
(N = 1,739)

A. At what point are you in your studies?

 3% (1) I am in or have just completed my junior year of
        undergraduate study.
60% (2) I am an undergraduate senior.
20% (3) I have a bachelor's degree but am not presently enrolled
        in graduate school.
10% (4) I am in or have just completed my first year of graduate
        study.
 6% (5) I am in or have completed my second year of graduate
        study.

B. What graduate degree do you intend to seek?

 3% (1) I do not plan to pursue graduate study.
 3% (2) I plan to pursue graduate work but not to obtain a
        graduate degree.
49% (3) I plan to obtain terminal M.A., M.S., or other degree at the
        master's level.
28% (4) I plan to obtain M.A., M.S., or other master's level degree
        leading to a doctoral degree.
15% (5) I plan to obtain Ph.D., Ed.D., or other degree at the
        doctoral level.




C. Was Spanish regularly spoken in your home when you were a
   child?

25% (1) Yes
74% (2) No

D. For what length of time have you studied in or lived in a
   Spanish-speaking country?

27% (1) Not at all
21% (2) Some, but less than three months
10% (3) Three to six months
13% (4) Six months to one year
28% (5) More than one year

E. What is (or was) your undergraduate major field?

81% (1) Spanish
 3% (2) Another foreign language
14% (3) Other

F. If you majored in Spanish as an undergraduate, which of the
   following was most emphasized in your courses?

62% (1) Literature
14% (2) Language proficiency
 5% (3) Civilization and culture (including area studies)
[illegible] (4) Linguistics (history of language, structure of language)
 5% (5) Other

G. In your undergraduate literature and/or civilization courses,
   what was the relative emphasis given to Spain and Spanish
   America?

58% (1) Greater emphasis was given to Spain.
12% (2) Greater emphasis was given to Spanish America.
25% (3) Spain and Spanish America received about equal
        emphasis.



References 



Creager, J. A. Predicting doctorate attainment with GRE and other
variables (Technical Report No. 25). Washington, D.C.: Office of
Scientific Personnel, National Academy of Sciences-National
Research Council, November 1965.

Eckhoff, C. M. Predicting graduate success at Winona State
College. Educational and Psychological Measurement, 1966, 26,
483-485.

Ewen, R. B. The GRE psychology test as an unobtrusive measure of
motivation. Journal of Applied Psychology, 1969, 53, 383-387.

Hackman, J. R., Wiggins, N., & Bass, A. R. Prediction of long-term
success in doctoral work in psychology. Educational and
Psychological Measurement, 1970, 30, 365-374.

Johnson, H., & Thompson, B. The Graduate Record Examinations at
Sacramento State College (Technical Bulletin No. 11, Student
Personnel Division, Sacramento State College, 1962). Reported
by G. V. Lannholm in GRE Special Report 68-1. Princeton, N.J.:
Educational Testing Service, 1968.

Lannholm, G. V., Marco, G. L., & Schrader, W. B. Cooperative
studies of predicting graduate school success (GRE Special
Report 68-3). Princeton, N.J.: Educational Testing Service, 1968.

Lorge, I. Relationship between Graduate Record Examinations and
Teachers College, Columbia University, doctoral verbal
examinations (Letter to G. V. Lannholm dated September 21, 1960).
Reported by G. V. Lannholm in GRE Special Report 68-1.
Princeton, N.J.: Educational Testing Service, 1968.

Mehrabian, A. Undergraduate ability factors in relationship to
graduate performance. Educational and Psychological
Measurement, 1969, 29, 409-419.

Michels, W. C. Graduate Record Examinations Advanced Physics
Test as a predictor of performance. American Journal of Physics,
1966, 34 (9, Pt. 2).




Newman, R. I. GRE scores as predictors of GPA for psychology
graduate students. Educational and Psychological Measurement,
1968, 28, 433-43[illegible].

Office of Educational Research. Study of GRE scores of geology
students matriculating in the years 1952-1961 (RP-Abstract, Yale
University, 1963). Reported by G. V. Lannholm in GRE Special
Report 68-1. Princeton, N.J.: Educational Testing Service, 1968.

Roberts, P. T. An analysis of the relationship between Graduate
Record Examination scores and success in the Graduate School
of Wake Forest University. Unpublished master's thesis, Wake
Forest University, 1970.

Rock, D. A. The prediction of doctorate attainment in psychology,
mathematics, and chemistry (GRE Board Report 69-6). Princeton,
N.J.: Educational Testing Service, 1972.

Roscoe, J. T., & Houston, S. R. The predictive validity of GRE scores
for a doctoral program in education. Educational and
Psychological Measurement, 1969, 29, 507-509.

Sistrunk, F. The GREs as predictors of graduate school success in
psychology (Letter to G. V. Lannholm dated October 3, 1961).
Reported by G. V. Lannholm in GRE Special Report 68-1.
Princeton, N.J.: Educational Testing Service, 1968.

Voorhees, H. R. Relationship between scores on Graduate Record
Examinations and graduate school performance in physics.
Unpublished manuscript cited by G. V. Lannholm in GRE Special
Report 68-3. Princeton, N.J.: Educational Testing Service, 1968.

Williams, J. D., Harlow, S. D., & Grab, D. A longitudinal study
examining prediction of doctoral success: Grade point average
as criterion, or graduation vs. nongraduation as criterion. Journal
of Educational Research, December 1970, 64, 161-164.











